Abstract

We continue the study of communication cost of computing functions when inputs are distributed among 
k processors, each of which is located at one vertex of a network/graph called a terminal. Every other node of 
the network also has a processor, with no input. The communication is point-to-point and the cost is the total 
number of bits exchanged by the protocol, in the worst case, on all edges.

Chattopadhyay, Radhakrishnan and Rudra (FOCS'14) recently initiated a study of the effect of topology of
the network on the total communication cost using tools from $L_1$ embeddings. Their techniques provided tight
bounds for simple functions like Element-Distinctness (ED), which depend on the 1-median of the graph. This
work addresses two other kinds of natural functions. We show that for a large class of natural functions like
Set-Disjointness the communication cost is essentially n times the cost of the optimal Steiner tree connecting the
terminals. Further, we show for natural composed functions like ED ∘ XOR and XOR ∘ ED, the naive protocols
suggested by their definition are optimal for general networks. Interestingly, the bounds for these functions depend
on more involved topological parameters that are a combination of Steiner tree and 1-median costs.

To obtain our results, we use some new tools in addition to ones used in Chattopadhyay et al. These include (i)
viewing the communication constraints via a linear program; (ii) using tools from the theory of tree embeddings to
prove topology sensitive direct sum results that handle the case of composed functions and (iii) representing the
communication constraints of certain problems as a family of collections of multiway cuts, where each multiway
cut simulates the hardness of computing the function on the star topology.

1 Introduction


We consider the following distributed computation problem $p = (f, G, K, \Sigma)$: there is a set $K$ of $k$ processors that
have to jointly compute a function $f : \Sigma^k \to \{0,1\}$. Each of the $k$ inputs to $f$ is held by a distinct processor. Each
processor is located on some node of a network (graph) $G = (V,E)$. The nodes in $V$ with an input are called
terminals and the set of such nodes is denoted by $K$. The other nodes in $V$ have no input but have processors
that also participate in the computation of $f$ via the following communication process: there is some fixed a-priori
protocol according to which, in each round of communication, nodes of the network send messages to
their neighbors. The behavior of a node in any round is just a (randomized) function of inputs held by it and the
sequence of bits it has received from its neighbors in the past. All communication is point-to-point in the sense that
each edge of $G$ is a private communication channel between its endpoints. In any round, if one of the endpoints
of an edge is in a state where it expects to receive some communication from the other side, then silence from
the other side is not allowed in a legal protocol. At the end of the communication process, some pre-designated node
of the network outputs the value of $f$ on the input instance held by processors in $K$. We assume that protocols
are randomized, using public coins that are accessible to all nodes of the network, and err with small probability.
The cost of a protocol on an input is the expected total number of bits communicated on all edges of the network.
The main question we study in this work is how the cost of the best protocol on the worst input depends on the
function $f$, the network $G$ and the set of terminals $K$. This cost is denoted by $R_\epsilon(p)$ (and we use $R(p)$ to denote
$R_{1/3}(p)$). It is not difficult to see that this cost is lower bounded by the expected cost (of the best protocol) under
any distribution $\mu$ over the inputs to nodes in $K$. This latter quantity is denoted by $R_{\epsilon,\mu}(p)$ and turns out to be
easier to lower bound under a conveniently chosen $\mu$.

This communication model seems to be a natural abstraction of many distributed problems and was recently
studied in its full generality by Chattopadhyay, Radhakrishnan and Rudra [11].¹ A noteworthy special case is when
$G$ is just a pair of nodes connected by an edge. This corresponds to the classical model of 2-party communication
introduced by Yao [45] more than three decades ago. The study of the classical model has blossomed into the
vibrant and rich field of communication complexity, which has deep connections to theoretical computer science
in general and computational complexity in particular.

This point-to-point model had received early attention in the works of Tiwari [?], Dolev and Feder [15] and
Duris and Rolim [17]. These early works seem to have entirely focused on deterministic and non-deterministic
complexities. In particular, Tiwari [?] showed several interesting topology-sensitive bounds on the cost of deterministic
protocols for simple functions. However, these bounds were for specific graphs like trees, grids, rings
etc. More recently, there has been a resurgence of interest in the randomized complexity of functions in the
point-to-point model. These have several motivations: the BSP model of Valiant [40], models for MapReduce [?], parallel
models to compute conjunctive queries [7], distributed models for learning [4], distributed streaming and
functional monitoring [?], sensor networks [?] etc. Interestingly, in a very recent work Drucker, Kuhn and Oshman [16]
showed that some outstanding questions in this model (where one is interested in bounding the number
of rounds of communication as opposed to bounding the total communication) have connections to well
known hard problems on constant-depth circuits. Motivated by such diverse applications, a flurry of recent
works [?, 42, 43, 44, ?, 22, 23, 10] have proved strong lower bounds, developing very interesting techniques. All of
these works, however, focus on the star topology with $k$ leaves, each having a terminal and a central non-terminal
node. Note that every function on the star can be computed using $O(kn)$ bits of communication, by making the
leaves simultaneously send each of their $n$-bit inputs to the center, which outputs the answer. The aforementioned
recent works show that this is an optimal protocol for various natural functions.

In contrast, on a general graph not all functions seem to admit $O(kn)$-bit protocols. Consider the naive protocol
that makes all terminals send their inputs to a special node $u$. The speciality of $u$ is the following: let the status
of a node $v$ in network $G$ w.r.t. $K$, denoted by $\sigma_K(v)$, be given by $\sum_{w \in K} d_G(v,w)$, where $d_G(x,y)$ is the length of
a shortest path in $G$ between nodes $x$ and $y$. Node $u$ is special and called the median as it has a minimal status
among all nodes, which we denote by $\sigma_K(G)$. Thus, the cost of the naive protocol is $\sigma_K(G) \cdot n$. For the star, the
center is the median with status $k$. On the other hand, for the line, ring and grid, each having $k$ nodes all of which
are terminals, $\sigma_K(G)$ is $\Theta(k^2)$, $\Theta(k^2)$ and $\Theta(k^{3/2})$ respectively.
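The status values quoted above can be checked numerically on small instances. The following Python sketch computes the median status by breadth-first search; the helper names (`bfs_dist`, `median_status`, `line`, `ring`, `grid`, `star`) are illustrative and not taken from the paper, and the graph is assumed to be connected and unweighted.

```python
from collections import deque

def bfs_dist(adj, src):
    """Hop distances from src in a connected, unweighted graph {node: [neighbours]}."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def median_status(adj, terminals):
    """Minimum over all nodes v of sum_{w in K} d_G(v, w)."""
    status = {v: 0 for v in adj}
    for w in terminals:  # one BFS per terminal, accumulated at every node
        d = bfs_dist(adj, w)
        for v in adj:
            status[v] += d[v]
    return min(status.values())

def line(k):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < k] for i in range(k)}

def ring(k):
    return {i: [(i - 1) % k, (i + 1) % k] for i in range(k)}

def grid(side):
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(r, c): [(r + dr, c + dc) for dr, dc in steps
                     if 0 <= r + dr < side and 0 <= c + dc < side]
            for r in range(side) for c in range(side)}

def star(k):
    adj = {i: ["center"] for i in range(k)}  # k leaves, all terminals
    adj["center"] = list(range(k))           # non-terminal center
    return adj
```

Calling, for instance, `median_status(grid(s), list(grid(s)))` for growing `s` lets one compare the growth against $k^{3/2}$ with $k = s^2$; multiplying any of these values by $n$ gives the cost of the naive protocol on that topology.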

¹ Related but different problems have been considered in distributed computing. Please see Appendix A for more details.

The work in [11] appears to be the first one to address the issue of randomized protocols over arbitrary $G$. It
shows that simple natural functions like Element-Distinctness² have $\Theta(\sigma_K(G))$ as the cost (up to a poly-log($k$) factor)
of the optimal randomized protocol computing them. While these are essentially the strongest possible lower
bounds,³ not all functions of interest have that high complexity. Consider the function Equality that outputs 1
precisely when all input strings at the nodes in $K$ are the same. There is a randomized protocol of cost much less
than $\sigma_K(G)$ for computing it: consider a minimal cost Steiner tree with nodes in $K$ as the terminals. Let the cost of
this tree be denoted by $ST(G,K)$. Root this tree at an arbitrary node. Each leaf node sends a hash (it turns out $O(1)$
bits of random hash suffice for our purpose⁴) of its string to its parent. Each internal node $u$ collects all hashes
that it receives from nodes in the sub-tree rooted at $u$ and verifies whether they are all equal to some string $s$. If so, it sends $s$
to its parent and otherwise, it sends a special symbol to its parent indicating inequality. Thus, in cost $O(ST(G,K))$,
one can compute Equality with small error probability.⁵
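The Equality protocol just described can be simulated on a rooted tree. The sketch below is only an illustration under stated assumptions: the tree is given explicitly (finding a minimal Steiner tree is not attempted), every leaf holds an input, the hash is a random linear hash (inner products mod 2 with shared random masks, standing in for public coins), and the function names and the choice of 2 hash bits are hypothetical rather than the paper's.

```python
import random

def linear_hash(bits, masks):
    """One output bit per mask: the inner product <bits, mask> mod 2."""
    return tuple(sum(b & m for b, m in zip(bits, mask)) % 2 for mask in masks)

def equality_on_tree(children, root, inputs, n, hash_bits=2, seed=0):
    """Simulate the hash-and-forward protocol; return (answer, bits_sent).

    children: {node: [child, ...]} for a rooted tree whose leaves hold inputs.
    inputs:   {terminal: list of n bits}.
    """
    rng = random.Random(seed)  # stands in for the public coins
    masks = [[rng.randrange(2) for _ in range(n)] for _ in range(hash_bits)]
    bits_sent = 0

    def send_up(u):
        nonlocal bits_sent
        seen = set()
        if u in inputs:
            seen.add(linear_hash(inputs[u], masks))
        for child in children.get(u, []):
            seen.add(send_up(child))
            bits_sent += hash_bits + 1  # hash plus a one-bit "unequal" flag
        if not seen:
            raise ValueError("every leaf of the tree must hold an input")
        # None plays the role of the special symbol indicating inequality.
        return next(iter(seen)) if len(seen) == 1 else None

    return int(send_up(root) is not None), bits_sent
```

Each tree edge carries `hash_bits + 1` bits, so for a constant number of hash bits the total is proportional to the number of tree edges. Equal inputs are always accepted; unequal inputs are rejected only with some probability over the masks, which is why no particular answer is asserted for that case.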

For many scenarios in a distributed setting, the task to be performed is naturally layered in the following way.
The set of terminal nodes is divided into $t$ groups $K_1, \ldots, K_t$. Within a group of $m$ terminals, the input needs to
be pre-processed in a specified manner, expressed as a function $g : (\{0,1\}^n)^m \to \{0,1\}^n$. Finally the results of the
computation of the groups need to be combined in a different way, given by another function $f : (\{0,1\}^n)^t \to \{0,1\}$.
More precisely, we want to compute the composed function $f \circ g$. The canonical protocol will first compute in
parallel all instances of the task $g$ in groups using the optimal protocol for $g$ and then use the optimal protocol for $f$
on the outputs of $g$ in each of the $K_i$. However, this is not the optimal protocol for all $f$, $g$ and network $G$. For example,
consider the case when $f$ is Equality and $g$ is the bit-wise XOR function. As we show later, the optimal protocol
for computing XOR has cost $\Theta(ST(G,K) \cdot n)$. Hence, the naive protocol for EQ ∘ XOR will have cost
$\Omega\big(ST(G,K') + \sum_{i=1}^{t} ST(G,K_i) \cdot n\big)$. However, it is not hard to see that there is a protocol of cost $O(t \cdot ST(G,K))$. This cost can be
much lower than the naive cost depending on the network.


2 Our Results 


The first part of our work attempts to understand when the naive protocol cannot be improved upon for composed
functions. Function composition is a widely used technique in computational complexity for building new functions
out of more primitive ones [36, 19, 20, ?, 23]. Proving that the naive way of solving $f \circ g$ is essentially optimal
in many models remains open. In particular, even in the 2-party model of communication where the network is just
an edge, this problem still remains unsolved (see [6]). To describe our results on composition, we need the following
terminology: The cost of solving a problem $(f, G, K, \{0,1\}^n)$ will have a dependence on both $n$ and the topology
of $G$. We will deal with two kinds of dependence on $n$. If the cost depends linearly on $n$, we say $f$ is of linear type.
Otherwise, there is no dependence on $n$. (We typically ignore poly-log factors in this paper.) Call $f$ a 1-median
type function if its topology-sensitive complexity is $\sigma_K(G)$. We say $f$ is of Steiner tree type if its topology-sensitive
complexity is $ST(G,K)$. The protocol for a Steiner tree type problem $f$ seems to move information around in a
fundamentally different way from the one for a 1-median type problem $g$. It seems tempting to expect that their
composition cannot be solved by any cheaper protocol than the naive ones. However, we are only able to prove
this intuition for a few natural instances in this work.

Consider the following composition: the first function is the element distinctness function, denoted by ED, which
was shown by [11] to be of 1-median type. The second is the bit-wise XOR function (which we denote by $XOR_n$),
which is shown to be of linear Steiner-tree type later in Appendix B. In particular, given a graph $G = (V,E)$ and
$t$ subsets $K_1, \ldots, K_t \subseteq V$, we define the composed function $ED \circ XOR_n$ as follows. Given $k_i \stackrel{def}{=} |K_i|$ $n$-bit vectors
$X^i_1, \ldots, X^i_{k_i} \in \{0,1\}^n$ for every $i \in [t]$, define

$$ED \circ XOR_n\big(X^1_1, \ldots, X^1_{k_1}, \ldots, X^t_1, \ldots, X^t_{k_t}\big) = ED\big(XOR_n(X^1_1, \ldots, X^1_{k_1}), \ldots, XOR_n(X^t_1, \ldots, X^t_{k_t})\big).$$


² Given inputs $X_i \in \Sigma$ for every $i \in K$, the function $ED : \Sigma^k \to \{0,1\}$ is defined as follows: $ED((X_i)_{i \in K}) = 1$ if and only if $X_i \neq X_j$ for every $i \neq j \in K$.

³ Strictly speaking, the strongest lower bound is $\Omega(\sigma_K(G) \cdot n)$. Several functions, called linear 1-median type later, are shown to achieve this
bound in [11].

⁴ Observe that if two strings held at two terminals are not equal, each hash will detect inequality with probability 2/3.

⁵ In fact, we observe in Theorem 9 that any function $f : \Sigma^k \to \{0,1\}$ that depends on all of its input symbols needs an $\Omega(ST(G,K))$ amount of
communication (even for randomized protocols), which implies that the randomized protocol above for Equality is essentially optimal.

The naive algorithm mentioned earlier specializes for $ED \circ XOR_n$ as follows: compute the inner bit-wise XORs first⁶
and then compute the ED on the intermediate values. This immediately leads to an upper bound of


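$$O\Big(\Big(\sigma_{K_1,\ldots,K_t}(G) + \sum_{i=1}^{t} ST(G,K_i)\Big) \cdot \log k\Big) \qquad (1)$$

where $\sigma_{K_1,\ldots,K_t}(G)$ is the minimum of $\sigma_{K'}(G)$ over every choice of $K'$ that has exactly one terminal from $K_i$ for every
$i \in [t]$. One of our results, stated below, shows that this upper bound is tight to within a poly-log factor: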


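Theorem 1.

$$R(ED \circ XOR_n, G, K, \{0,1\}^n) \geq \Omega\left(\frac{\sigma_{K_1,\ldots,K_t}(G)}{\log t} + \frac{\sum_{i=1}^{t} ST(G,K_i)}{\log |V| \log\log |V|}\right).$$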


We prove the above result (and other similar results) by essentially proving a topology sensitive direct sum
theorem (see Section 3.1 for more).

To get a feel for how (1) behaves between the two extremes, consider the case when $G$ is a $\sqrt{k} \times \sqrt{k}$ grid and
the set of $k$ terminals (i.e. all nodes are terminals) is divided into $t$ sets of size $k/t$, where each $K_i$ for $i \in [t]$ is a
$\sqrt{k/t} \times \sqrt{k/t}$ sub-grid. It can be verified that in this case (1) is (up to an $O(\log k)$ factor) $t\sqrt{k} + k$. In Section E.3.3
we further show that changing the order of composition to XOR ∘ ED also does not allow any cost savings over the
naive protocol:


Theorem 2. For every choice of $u_i \in K_i$:

$$R(XOR_1 \circ ED, G, K, \{0,1\}^n) \geq \Omega\left(ST(G, \{u_1, \ldots, u_t\}) + \frac{\sum_{i=1}^{t} \sigma_{K_i}(G)}{\log k}\right).$$


The results discussed so far follow by appropriately reducing the problem on a general graph to a bunch of
two-party lower bounds, one across each cut in the graph. This was the general idea in [11] as well but the reductions
in this paper need to use different tools. However, the idea of two-party reduction seems to fail for the
Set-Disjointness function, which is one of the centrally studied functions in communication complexity. In our setting,
the natural definition of Set-Disjointness (denoted by DISJ) is as follows: each of the $k$ terminals in $K$ has an
$n$-bit string and the function tests if there is an index $i \in [n]$ such that all $k$ strings have their $i$th bit set to 1. It is easy
to check that this function can be computed with $O(ST(G,K) \cdot n)$ bits of communication (in fact one can compute
the bit-wise AND function with this much communication by successively computing the partial bit-wise AND as
we go up the Steiner tree). Before our work, a tight bound was known only for the special case of $G$ being a $k$-star
(i.e. a lower bound of $\Omega(kn)$), due to the recent work of Braverman et al. [9]. In this work, we present a fairly general
technique that ports a tight lower bound on a $k$-star to an almost tight lower bound for the general graph case. For
the complexity of Set-Disjointness, this technique yields the following bound:


Theorem 3.

$$R(DISJ, G, K, \{0,1\}^n) \geq \Omega\left(\frac{ST(G,K) \cdot n}{\log^2 k}\right).$$
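For comparison with this lower bound, the matching upper-bound protocol for DISJ (successively computing the partial bit-wise AND while going up the Steiner tree) can be sketched as follows. This is a minimal simulation assuming the rooted tree is given explicitly; the function name `disj_on_tree` and the input encoding are illustrative choices, not the paper's.

```python
def disj_on_tree(children, root, inputs, n):
    """Return (answer, bits_sent) for the AND-up-the-tree protocol.

    children: {node: [child, ...]} describing a rooted tree.
    inputs:   {terminal: list of n bits}; nodes without input are non-terminals.
    answer is 1 iff some index has its bit set to 1 in every input string.
    """
    bits_sent = 0

    def send_up(u):
        nonlocal bits_sent
        # A non-terminal contributes the all-ones string, the identity for AND.
        partial = list(inputs.get(u, [1] * n))
        for child in children.get(u, []):
            msg = send_up(child)
            bits_sent += n  # each tree edge carries one n-bit partial AND
            partial = [a & b for a, b in zip(partial, msg)]
        return partial

    common = send_up(root)
    return int(any(common)), bits_sent
```

Every tree edge carries exactly $n$ bits, so the total is $n$ times the number of tree edges, in line with the $O(ST(G,K) \cdot n)$ upper bound stated in the text.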


Next, we present our key technical results and an overview of their proofs. We would like to point out that
our proofs use many tools used in algorithm design, like (sub)tree embeddings, Borůvka's algorithm to compute
an MST for a graph and integrality gaps of some well-known LPs, besides using $L_1$-embeddings of graphs that were
also used in [11]. We hope this work encourages further investigation of other algorithmic techniques to prove
message-passing lower bounds.

⁶ In fact, we just need to compute the XOR of the hashes of the input, which with a linear hash is just the bit-wise XOR of $O(\log k)$ bits of
hashes.

3 Key Technical Results and Our Techniques

In Appendix B we present a simple formulation of communication lower bounds in terms of a linear program (LP),
whose constraints correspond to two-party communication complexity lower bounds induced across various cuts
in the graph $G$. In particular, we prove our earlier claimed lower bound of $\Omega(ST(G,K) \cdot n)$ for the $XOR_n$ problem.
Further, this connection can also be used to recover the $\Omega(\sigma_K(G)/\log k)$ lower bound for the ED function from [11];
see Theorem 11. While LPs have been used to prove communication complexity lower bounds in the standard 2-party
setting (see e.g. [?, 38]), our use of LPs above seems to be novel for proving communication lower bounds. In
the remainder of the section, we present two general results that we will use to prove our lower bounds for specific
functions including those in Theorems 1, 2 and 3. (See Appendix E for the details.)

3.1 A Result on Two LPs 

We now present a result that relates the objective values of two similar LPs. Both the LPs will involve the same
underlying topology graph $G = (V,E)$.

We begin with the first LP, which we dub $LP_L(G)$:

$$\min \sum_{e \in E} x_e$$

subject to

$$\sum_{e \text{ crosses } C} x_e \geq \sum_{i=1}^{\ell} b^i(C) \quad \text{for every cut } C$$

$$x_e \geq 0 \quad \text{for every } e \in E.$$

In our results, we will use $x_e$ to denote the expected communication on edge $e$ of an arbitrary protocol for a problem $p$ over a
distribution over the input. The constraint for each cut $C$ will correspond to a two-party lower bound of $\sum_{i=1}^{\ell} b^i(C)$.
Then the objective value of the above LP, which by abuse of notation we will also denote by $LP_L(G)$, will be a valid
lower bound on $R(p)$.

Next we consider the second LP, which we dub $LP_U(G)$:

$$\min \sum_{i=1}^{\ell} \sum_{e \in E} x_{i,e}$$

subject to

$$\sum_{e \text{ crosses } C} x_{i,e} \geq b^i(C) \quad \text{for every cut } C \text{ and } i \in [\ell]$$

$$x_{i,e} \geq 0 \quad \text{for every } e \in E \text{ and } i \in [\ell].$$

In our results, we will connect the objective value of the above LP (which again with abuse of notation we will
denote by $LP_U(G)$) to the total communication of a trivial algorithm that solves problem $p$.

Our main aim is to show that for certain settings, the lower bound we get from $LP_L(G)$ is essentially the same
as the upper bound we get from $LP_U(G)$.
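To make the two constraint systems concrete, the following Python sketch checks feasibility of candidate solutions by brute-force enumeration of all cuts of a tiny graph. It is only an illustration: the function names are hypothetical, the demands $b^i(C)$ are passed as Python callables on one side of the cut, and the Steiner-type demand (1 if the cut separates a given terminal set, else 0) is used as an assumed example. Enumeration is exponential in $|V|$, so this is meant for toy instances only.

```python
from itertools import combinations

def cuts(nodes):
    """Yield one side S of every cut (S, V \\ S); nodes[0] is always in S."""
    first, rest = nodes[0], nodes[1:]
    for r in range(len(rest)):  # r < len(rest) keeps V \ S non-empty
        for extra in combinations(rest, r):
            yield frozenset((first,) + extra)

def crossing_weight(x, side):
    """Total weight of edges (u, v) with exactly one endpoint in side."""
    return sum(w for (u, v), w in x.items() if (u in side) != (v in side))

def feasible_lower(x, nodes, demands):
    """First LP: crossing weight >= sum_i b^i(C) for every cut C."""
    return all(crossing_weight(x, s) >= sum(b(s) for b in demands)
               for s in cuts(nodes))

def feasible_upper(xs, nodes, demands):
    """Second LP: for each i separately, crossing weight of x_i >= b^i(C)."""
    return all(crossing_weight(x_i, s) >= b(s)
               for x_i, b in zip(xs, demands) for s in cuts(nodes))

def steiner_demand(terminal_set):
    """Assumed example demand: 1 if the cut separates the set, else 0."""
    k = frozenset(terminal_set)
    return lambda side: 1 if (k & side and k - side) else 0
```

Summing a feasible solution of the second LP over $i$ always gives a feasible solution of the first LP with the same objective value, which is the easy direction of the comparison between the two; the tests below exercise this on a four-node path.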

Before we state our main technical result, we need to define the property we need on the values $b^i(C)$. In
particular, let $\delta(C)$ denote the set of crossing edges for a cut $C$. We say that the values $b^i(C)$ satisfy the sub-additive
property if for any three cuts $C_1$, $C_2$ and $C_3$ such that $C_1 \cup C_2 = C_3$,⁷ we have that for every $i \in [\ell]$: $b^i(C_3) \leq b^i(C_1) + b^i(C_2)$.
We remark that the two main families of functions that we consider in this paper lead to LPs that do satisfy
the sub-additive property (see the appendix). We are now ready to state our first main technical result:

⁷ This means that one side of the cut $C_3$ is the union of one side each of $C_1$ and $C_2$.

Theorem 4. For any graph $G = (V,E)$ (and values $b^i(C)$ for any $i \in [\ell]$ and cut $C$ with the sub-additive property), we
have

$$LP_U(G) \geq LP_L(G) \geq \Omega\left(\frac{1}{\log |V| \log\log |V|}\right) \cdot LP_U(G).$$


Theorem 4 is the main ingredient in proving the lower bound for a 1-median function composed with a Steiner
tree function as given in Theorem 1 (see Appendix E.3.1). We can also use Theorem 4 to prove nearly tight lower
bounds for composing a Steiner tree type function XOR with a linear 1-median function IP as well as another 1-median
function ED. However, it turns out that for these functions, we can prove a better bound than the one given by Theorem 4. In
particular, using techniques developed in [11], we can prove the lower bounds given in Theorem 2 and the one stated
below (see Appendix E.3.2 for details):


Corollary 1. For every choice of $u_i \in K_i$:

$$R(XOR \circ IP_n, G, K, \{0,1\}^n) \geq \Omega\left(ST(G, \{u_1, \ldots, u_t\}) + \frac{\sum_{i=1}^{t} \sigma_{K_i}(G) \cdot n}{\log k}\right).$$


3.1.1 Proof Overview 


We give an overview of our proof of Theorem 4 (specialized to the proof of Theorem 1). While the LP based lower
bound argument for $XOR_n$ in Appendix B is fairly straightforward, things get more interesting when we consider
$ED \circ XOR_n$. It turns out that by just embedding the hard distribution for ED from [11], one can prove a lower bound of
just $\Omega\big(\sigma_{K_1,\ldots,K_t}(G)/\log t\big)$ (see Lemma E.2). The more interesting part is proving a lower bound of roughly $\sum_{i=1}^{t} ST(G,K_i)$. It is
not too hard to connect the upper bound of $O\big(\sum_{i=1}^{t} ST(G,K_i)\big)$ to the following LP, which we dub $LP^U_{ST}(G,K)$ (and is
a specialization of $LP_U(G)$):

$$\min \sum_{i=1}^{t} \sum_{e \in E} x_{i,e}$$

subject to

$$\sum_{e \text{ crosses } C} x_{i,e} \geq 1 \quad \text{for every cut } C \text{ that separates } K_i \text{ and } i \in [t]$$

$$x_{i,e} \geq 0 \quad \text{for every } e \in E \text{ and } i \in [t].$$

Indeed the above LP is basically solving the sum of $t$ independent linear programs: call them $LP_{ST}(G,K_i)$ for each
$i \in [t]$. Hence, one can independently optimize each of these $LP_{ST}(G,K_i)$ and then just put them together to get an
optimal solution for $LP^U_{ST}(G,K)$. This matches the claimed upper bounds since it is well-known that the objective
value of $LP_{ST}(G,K_i)$ is $\Theta(ST(G,K_i))$ [?].

On the other hand, if one tries the approach we used to prove the lower bound for $XOR_n$, then one picks an
appropriate hard distribution $\mu$ and shows that for every cut $C$ the induced two-party problem has a high enough
lower bound. In this case, it turns out (see Section E.3.1) that the corresponding two-party lower bound (ignoring
constant factors) is the number of sets $K_i$ separated by the cut. Then proceeding as in the argument for $XOR_n$, if
one sets $y_e$ to be the expected (under $\mu$) communication for any fixed protocol over any $e \in E$, then $(y_e)_{e \in E}$ is a
feasible solution for the following LP, which we dub $LP^L_{ST}(G,K)$ (and is a specialization of $LP_L(G)$):

$$\min \sum_{e \in E} x_e$$

subject to

$$\sum_{e \text{ crosses } C} x_e \geq v(C,K) \quad \text{for every cut } C$$

$$x_e \geq 0 \quad \text{for every } e \in E,$$

where $v(C,K)$ is the number of subsets $K_i$ that are separated by $C$. If we denote the objective value of the above LP
by $LP^L_{ST}(G,K)$, then we have an overall lower bound of $\Omega(LP^L_{ST}(G,K))$. Thus, we would be done if we can show that
$LP^L_{ST}(G,K)$ and $LP^U_{ST}(G,K)$ are close. It is fairly easy to see that $LP^L_{ST}(G,K) \leq LP^U_{ST}(G,K)$. However, to prove a tight
lower bound, we need an approximate inequality in the other direction. We show this is true by the following two
step process:

1. First we observe that if $G$ is a tree $T$ then $LP^L_{ST}(T,K) = LP^U_{ST}(T,K)$.

2. Then we use results from embedding graphs into sub-trees to show that there exists a subtree $T$ of $G$ such
that $LP^L_{ST}(G,K) \approx LP^L_{ST}(T,K)$ and $LP^U_{ST}(G,K) \approx LP^U_{ST}(T,K)$, which with the first step completes our proof.
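The first step of this process can be checked by a short calculation. The derivation below is a sketch, writing $LP^L_{ST}$ for the Steiner-tree LP with right-hand side $v(C,K)$ and $LP^U_{ST}$ for the per-set LP with right-hand side 1. The notation $C_e$ (the cut obtained by deleting edge $e$ from the tree $T$) and $T[K_i]$ (the minimal subtree of $T$ spanning $K_i$) is introduced here only for illustration, and edges are taken to have unit cost.

```latex
% C_e: the cut of the tree T induced by deleting edge e; e is its only crossing edge.
% T[K_i]: the minimal subtree of T that spans K_i.
\begin{align*}
LP^{L}_{ST}(T,K)
  &\ge \sum_{e \in E(T)} v(C_e, K)
     && \text{(the constraint for } C_e \text{ reads } x_e \ge v(C_e,K)\text{)} \\
  &= \sum_{e \in E(T)} \sum_{i=1}^{t} \mathbf{1}\big[e \in E(T[K_i])\big]
     && \text{(} C_e \text{ separates } K_i \text{ iff } e \in E(T[K_i])\text{)} \\
  &= \sum_{i=1}^{t} \big|E(T[K_i])\big|
   \;=\; \sum_{i=1}^{t} ST(T,K_i) \\
  &= LP^{U}_{ST}(T,K)
     && \text{(} x_{i,e} = \mathbf{1}\big[e \in E(T[K_i])\big] \text{ is optimal for each } i\text{)}.
\end{align*}
```

The last equality holds because any cut separating $K_i$ must cross an edge on a tree path between two terminals of $K_i$, so the indicator solution is feasible, while the single-edge cuts $C_e$ force $x_{i,e} \geq 1$ on every edge of $T[K_i]$. Combined with the easy direction $LP^L_{ST}(T,K) \leq LP^U_{ST}(T,K)$, this gives equality on trees.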

We would like to remark on three things. First, our proof can handle more general constraints than those imposed
by the Steiner tree LP. In particular, we generalize the argument above to prove Theorem 4. Second, to the best
of our knowledge this result relating the objective values of these two similar LPs is new. However, we
would like to point out that our proof follows (with minor modifications) a similar structure that has been used to
prove other algorithmic results via tree embeddings (e.g. in [2]). Third, we find it interesting to observe that the
upper bound on the gap between the two LPs is the key step in accomplishing a distributed direct-sum like result.


3.2 From Star to Steiner Trees 

We define a multicut $\mathcal{C}$ of $K$ to be a collection of non-empty pair-wise disjoint subsets $C_1, \ldots, C_r$ of $K$. Each such
subset is called an explicit set of $\mathcal{C}$ and the (maybe empty) set $K \setminus \cup_{i=1}^{r} C_i$ is called its implicit set. We will call
$f : \Sigma^k \to \{0,1\}$ $h$-maximally hard on the star graph if the following holds for any multicut $\mathcal{C}$. There exists a
distribution $\mu^f_{\mathcal{C}}$ such that the expected cost (under $\mu^f_{\mathcal{C}}$) of any protocol that correctly computes $f$ on the following
star graph is $\Omega(|\mathcal{C}| \cdot h(|\Sigma|))$: each leaf of the star has all terminals from an explicit set from $\mathcal{C}$, no two leaves have
terminals from the same explicit set and the center contains terminals from the implicit set. The following is our
second main technical result:


Theorem 5. Let $f$ be $h$-maximally hard on the star graph. Then

$$R(f, G, K, \Sigma) \geq \Omega\left(\frac{ST(G,K) \cdot h(|\Sigma|)}{\log k}\right).$$


The above result easily implies the lower bound in Theorem 3 (see the appendix). Theorem 5 can also be used
to prove a lower bound similar to Theorem 3 above for the Tribes function using the lower bound for Tribes on the
star topology from [10]. We defer the proof of this claim to the full version of the paper.


3.2.1 Proof Overview 

In all of the arguments so far, we reduce the lower bound problem on $(G,K)$ to a bunch of two-party lower bounds
induced by cuts. However, we are not aware of any hard distribution such that one can prove a tight lower bound
that reduces the set disjointness problem to a bunch of two-party lower bounds. In fact, the only non-trivial lower
bound for set disjointness, in the point-to-point model, that we are aware of is the $\Omega(kn)$ lower bound for the $k$-star
by Braverman et al. [9]. In particular, their proof does not seem to work by reducing the problem to two-party
lower bounds. In this work, we are able to extend the set disjointness lower bound of [9] to Theorem 3.

We prove Theorem 3 by modifying the argument in [11] as follows. Essentially the idea in [11] is to construct a
collection of cuts such that essentially every edge participates in $O(\log k)$ cuts and one can prove the appropriate
two-party lower bound across each of the cuts in the collection so that when one sums up the contribution from
each cut one gets the appropriate $\Omega(\sigma_K(G)/\log k)$ overall lower bound. (This collection of cuts was obtained via
Bourgain's $L_1$ embedding [?]. As mentioned earlier, this trick does not seem to work for set disjointness and it
is very much geared towards 1-median type functions.) We modify this idea as follows: we construct a collection
of multi-cuts such that (i) every edge in $G$ appears in at most one multi-cut and (ii) one can use lower bounds on
the star graph to compute lower bounds for the induced function on each multi-cut, which can then be added up.

The main challenge in the above is to construct an appropriate collection of multi-cuts that satisfy properties (i)
and (ii) above. The main idea is natural: we start with balls of radius 0 centered at each of the $k$ terminals and then
grow all the balls at the same rate. When two balls intersect, we combine the two balls and grow the larger ball
appropriately. The multi-cut at any point of time is defined by the vertices in the various balls. To argue the required
properties, we observe that the algorithm above essentially simulates Borůvka's algorithm [?] on the metric closure
of $K$ with respect to the shortest path distances in $G$. In other words, we show that the sum of the contributions
of the lower bounds from each multi-cut is related to the MST on the metric closure of $K$ with respect to $G$, which
is well-known to be closely related to $ST(G,K)$ (see e.g. [41, Chap. 2]). It turns out that for set disjointness, one
has to define $O(\log k)$ different hard distributions (that depend on the structure of the multi-cuts above) and this
is the reason why we lose an $O(\log k)$ factor in our lower bound. (We lose another $O(\log k)$ factor since we use lower
bounds on the star topology.) To the best of our knowledge this is the first instance where the hard distribution
actually depends on the graph structure; most of our results as well as those preceding ours use hard distributions
that are independent of the graph structure. This argument generalizes easily to prove Theorem 5.
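The connection to Borůvka's algorithm mentioned above can be illustrated with a small sketch. The code below is not the paper's ball-growing construction: it is plain Borůvka run on the metric closure of the terminals, assuming a connected, unweighted graph, with hypothetical function names. It returns the MST weight on the metric closure together with the number of Borůvka rounds, which is at most $\log_2 k$ since every round at least halves the number of components.

```python
from collections import deque

def bfs_dist(adj, src):
    """Hop distances from src in a connected, unweighted graph {node: [neighbours]}."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def boruvka_on_metric_closure(adj, terminals):
    """Return (MST weight on the metric closure of the terminals, number of rounds)."""
    pts = list(terminals)
    closure = {u: bfs_dist(adj, u) for u in pts}
    # Edges are (weight, i, j) triples; tuple order breaks ties consistently.
    edges = [(closure[pts[i]][pts[j]], i, j)
             for i in range(len(pts)) for j in range(i + 1, len(pts))]
    parent = list(range(len(pts)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    total, rounds, components = 0, 0, len(pts)
    while components > 1:
        rounds += 1
        best = {}  # cheapest edge leaving each current component
        for e in edges:
            ri, rj = find(e[1]), find(e[2])
            if ri == rj:
                continue
            for r in (ri, rj):
                if r not in best or e < best[r]:
                    best[r] = e
        for w, i, j in sorted(set(best.values())):
            ri, rj = find(i), find(j)
            if ri != rj:  # skip an edge picked by both of its components
                parent[ri] = rj
                total += w
                components -= 1
    return total, rounds
```

It is a standard fact that the MST weight on the metric closure lies between $ST(G,K)$ and $2 \cdot ST(G,K)$; for a star with four leaf terminals, for example, the sketch returns weight 6 while the Steiner tree (the star itself) has cost 4.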


4 Open Questions 

We conclude by pointing out two of the many open questions that arise from our work: 

1. Our two main technical tools are complementary. Theorem 4 works for the case when the set of terminals
$K$ is divided into sets $K_1, \ldots, K_t$ and one applies some inner functions on these $K_i$'s. Theorem 4 allows us
to prove a sort of direct sum result in this case. However, this technique reduces the problem on $(G,K)$ to a
bunch of two-party lower bounds. On the other hand, Theorem 5 transforms the problem on $(G,K)$ to lower
bounds on star graphs. However, this cannot prove a direct sum type lower bound (and also only handles
Steiner tree type constraints). A natural question to ask is if one can get the best of both worlds, i.e. can we
show a direct sum type lower bound of the kind $\Omega\big(\sum_{i=1}^{t} ST(G,K_i)\big)$ by reducing the problem to a bunch of
lower bounds on the star topology?

2. In this paper we only present results for specific $f \circ g$. It would be nice to prove our conjecture from the
introduction: if the inner function is a (linear) Steiner tree type and the outer function is a (linear) 1-median
type function, then the trivial two-stage algorithm is optimal for $f \circ g$. There are several avenues to pursue
this. One such is to extend the XOR lemma (which corresponds to proving that the naive protocol is optimal
for XOR ∘ $g$) of Barak et al. from the two-party communication setting to ours (as long as $g$ is of 1-median
type).

Notes on the Appendix

Further, some of our results hold for the case when more than one input is assigned to the same terminal, i.e. we 
have a multi-set of terminals. In the appendix, we will use to denote the case of the set of terminals being a 
multi-set and K to denote the case that the set of terminals is a proper set. 


A Related Work in Distributed Computing 

Not surprisingly, the role of topology in computation has been studied extensively in distributed computing [34].
There are three main differences between works in this literature and ours. First, the main objective in distributed
computation is to minimize the end to end delay of the computation, which in communication complexity terminology
corresponds to the number of rounds needed to compute a given function. By contrast, we mostly consider
the related but different measure of the total amount of communication. Second, the effect of network topology on
the cost of communication has been analyzed to quite an extent when the networks are dynamic (see for example
the recent survey of Kuhn and Oshman [27]). By contrast, in this paper we are concerned with static networks of
arbitrary topology. Finally, there has also been work on proving lower bounds for distributed computing on static
networks, see e.g. the recent work of Das Sarma et al. [?]. This line of work differs from ours in at least two ways.
First, their aim is to prove lower bounds on the number of rounds needed to compute, especially when the edges
of the graph are capacitated. This paper, on the other hand, focuses on the total communication needed without
placing any restriction on the capacities of the edges or the number of rounds involved. Second, the kinds of
functions considered in the distributed computing community (for recent papers see e.g. [14, 28, 16]) are generally
of a different nature than the kinds of functions that we consider in this paper (which are more influenced by the
functions typically considered in the communication complexity literature). For many functions in distributed
computing, the function $f$ itself depends on $G$ (e.g. computing the diameter of $G$, the cost of the MST of $G$ etc.)
while all the functions we consider are independent of $G$; indeed we want to keep the function $f$ the same and
see how its communication complexity changes as we change $G$. Further, even for the case when $f$ is independent
of $G$ (e.g. sorting) typically one has $k = n$ and $V = K$, while in our case we have arbitrary $K$ and $|V|$ and $n$ are
independent parameters. (There is a very recent exception in [25].)


B Communication Complexity Lower Bounds via LPs

A basic idea in our technique is to understand the topological constraints placed on the communication demands
of the problem by considering cuts of a graph. The general idea of using cuts for this purpose has appeared in many
places before, like network coding (e.g. [?, 31, ?]) and function computation in sensor networks (e.g. [?]). But the
idea of using several cuts rather than a single cut that we describe next is primarily borrowed from [11] (similar
though slightly less general arguments were also made in [?, ?]). The original problem $(f, \{0,1\}^n, G, K)$ naturally
gives rise to a classical 2-party problem across a cut $C = (V^A, V^B)$, where $V^A$, $V^B$ partition the set of vertices $V(G)$.
In the 2-party problem, Alice gets the inputs of the terminals in $K^A = K \cap V^A$ and Bob gets the inputs of terminals in
$K^B = K \cap V^B$. Alice and Bob compute $f^C$, the induced problem on the cut. A protocol $\Pi$ solving $f$ induces a protocol
$\Pi'$ for Alice and Bob as follows: let $\delta(C)$ be the set of cut-edges. As long as $\Pi$ does not send any bits across any
edge in $\delta(C)$, Alice and Bob simulate $\Pi$ internally with no communication to each other. If $\Pi$ communicates bits
through edges in $\delta(C)$, Alice sends exactly those bits to Bob that were sent in $\Pi$ from vertices in $V^A$ to vertices in
$V^B$ via some pre-determined encoding in $\Pi'$. Bob then sends to Alice the bits sent in the other direction in $\Pi$. Thus,
the 2-party problem gets solved in essentially the same cost as the total number of bits sent over edges of $\delta(C)$ by
$\Pi$. However, the simple thing to note is that if $f^C$ is known to have large 2-party communication complexity of
$b(C)$, then that places a communication demand of $b(C)$ across the cut $C$.

We would like to say that we understand the communication bottlenecks in the graph, as is often done in
analyzing network flows, by specifying this demand $b(C)$ from our understanding of 2-party communication
complexity. An obvious problem is the following: usually randomized 2-party communication complexity specifies
“worst-case” complexity. The worst-case cost locally across each cut $C$ may not correspond to a globally
consistent input. It was observed in [11] that there is a simple fix to this. We define a global input distribution $\mu$ such
that the “expected” communication cost of the 2-party problem across cut $C$ w.r.t. the induced distribution $\mu_C$ is
$b(C)$. Then, linearity of expectation helps us analyze the expected communication cost of the original
problem. This idea was used in [11] with a special family of cuts obtained from $L_1$ embeddings of graphs. This
worked well to give 1-median type lower bounds, where the demand function $b(C)$ was of a specific type. In this
work, we want to deal with more varied demand functions. It turns out to be more convenient and (in hindsight)
more natural for us to write these two-party communication constraints as a linear program (LP). This helps us
not only to recover the bounds for the 1-median type functions but also to obtain tight bounds for other types of
functions.

We illustrate the use of an LP in our setting by considering the bit-wise xor function: given inputs
$X^i = (X^i_1, \dots, X^i_n) \in \{0,1\}^n$ for every $i \in K$, the function $\mathrm{XOR}_n : (\{0,1\}^n)^K \to \{0,1\}^n$ is defined as follows:
$\mathrm{XOR}_n\big((X^i)_{i \in K}\big) = \big(\oplus_{i \in K} X^i_j\big)_{j=1}^{n}$,
where $\oplus$ denotes the boolean xor function. It is easy to see that we can compute this function by successively
computing the bit-wise xor values of inputs along the minimum Steiner tree for $K$, which implies an upper bound of
$O(ST(G,K) \cdot n)$. We now show how one can prove an $\Omega(ST(G,K) \cdot n)$ lower bound for $\mathrm{XOR}_n$. Let $\mu$ be the hard
distribution that assigns an independent and uniformly random vector from $\{0,1\}^n$ to each of the $k$ terminals. Now fix
any protocol $\Pi$ that correctly solves the $\mathrm{XOR}_n$ function on $(G, K)$ on all inputs. Now consider a cut $C$ of $G$ that
separates the terminal set $K$. If one now considers the induced two-party problem, it is not too hard to see that if Alice
gets the vectors on one side of $C$ and Bob gets the rest of the input, then Alice and Bob are trying to solve the
two-party bit-wise XOR function. In particular, Alice and Bob have two vectors$^{*}$ $A, B \in \{0,1\}^n$ and they want to compute
$\mathrm{XOR}_n(A, B)$. $\Pi$ thus induces a bounded error randomized protocol for Alice and Bob where they communicate only
bits that $\Pi$ communicates on cut-edges. Further, the induced distribution $\mu_C$ on $(A, B)$ is the uniform distribution
on $\{0,1\}^n \times \{0,1\}^n$. It is not difficult to use an entropy argument and conclude that the two-party problem has an
expected (under $\mu_C$) communication complexity lower bound of at least $\alpha \cdot n$ for some absolute constant $\alpha > 0$.
Now for every $e \in E$, define $x_e$ to be the expected total communication through edge $e$ by $\Pi$ under $\mu$. Then the
argument above and linearity of expectation imply that the expected total communication complexity of $\Pi$ (scaled
down by a factor of $\alpha \cdot n$) is lower bounded by the objective value of the following linear program, which we will
dub $LP_{ST}(G,K)$:

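$$
\begin{aligned}
\min\ & \sum_{e \in E} x_e \\
\text{subject to}\ & \sum_{e \text{ crosses } C} x_e \ge 1 && \text{for every cut } C \text{ that separates } K \\
& x_e \ge 0 && \text{for every } e \in E.
\end{aligned}
$$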

By abuse of notation let $LP_{ST}(G,K)$ also denote the objective value of the above LP. It is well-known that $LP_{ST}(G,K)$
is $\Theta(ST(G,K))$ (see Theorem 6), which with the above discussion implies the desired lower bound of
$\Omega(ST(G,K) \cdot \alpha \cdot n) = \Omega(ST(G,K) \cdot n)$ for the $\mathrm{XOR}_n$ problem.

$^{*}$ $A$ is the bit-wise xor of all the inputs on Alice's side and $B$ is the bit-wise xor of all the inputs on Bob's side.

C More Details on Graph Parameters


It is well-known that $ST(G,K)$ is closely related to $LP_{ST}(G,K)$ (see e.g. [41]):

Theorem 6.

$$LP_{ST}(G,K) \le ST(G,K) \le 2 \cdot LP_{ST}(G,K).$$
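To make the two inequalities concrete, here is a minimal sketch in Python that checks feasibility for the Steiner tree LP by brute-force enumeration of cuts. The graph (a unit-weight triangle whose three vertices are all terminals) and the helper names are illustrative assumptions, not taken from the paper. On this instance the fractional solution with $x_e = 1/2$ on every edge is feasible with value 1.5 (summing the three singleton-cut constraints shows this is optimal), while the minimum Steiner tree costs 2, so both inequalities hold and the first is strict.

```python
from itertools import combinations

def separating_cuts(vertices, terminals):
    """Yield every vertex subset that has terminals on both sides of the cut."""
    vs = sorted(vertices)
    for r in range(1, len(vs)):
        for subset in combinations(vs, r):
            side = set(subset)
            inside = side & terminals
            if inside and inside != terminals:
                yield side

def is_feasible(vertices, terminals, x, eps=1e-9):
    """Check the Steiner tree LP constraints for edge weights x = {(u, v): weight}."""
    if any(w < 0 for w in x.values()):
        return False
    for side in separating_cuts(vertices, terminals):
        crossing = sum(w for (u, v), w in x.items() if (u in side) != (v in side))
        if crossing < 1 - eps:
            return False
    return True

# Hypothetical instance: a triangle in which every vertex is a terminal.
vertices = {"a", "b", "c"}
terminals = {"a", "b", "c"}
half = {("a", "b"): 0.5, ("b", "c"): 0.5, ("a", "c"): 0.5}   # fractional solution, value 1.5
tree = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 0.0}   # indicator of a Steiner tree, value 2
```

The enumeration is exponential in the number of vertices, so this is only meant for checking tiny examples by hand.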

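The quantity $\sigma_K(G)$ is closely related to the following LP, which we will dub $LP_{MDN}(G,K)$:

$$
\begin{aligned}
\min\ & \sum_{e \in E} x_e \\
\text{subject to}\ & \sum_{e \text{ crosses } C} x_e \ge \min(|C|, |K \setminus C|) && \text{for every cut } C \qquad (2) \\
& x_e \ge 0 && \text{for every } e \in E.
\end{aligned}
$$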


By abuse of notation let $LP_{MDN}(G,K)$ also denote the objective value of the above LP. The following result was
implicitly argued in [11]. For the sake of completeness we present a proof.


Theorem 7.

$$LP_{MDN}(G,K) \ge \Omega\left(\frac{\sigma_K(G)}{\log k}\right).$$

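Proof. Let $\mathcal{C}$ be the collection of cuts in $G$ guaranteed by Bourgain’s embedding (Hl^ that has the following two
guarantees: (i) every edge is cut by at most $\beta = O(\log k)$ cuts in $\mathcal{C}$ and (ii) for every $u \ne v \in V$, the pair is separated by at
least $d_G(u,v)$ cuts in $\mathcal{C}$.

Using the constraint (2) over all cuts in $\mathcal{C}$, we get that for the optimal solution $(x_e)_{e \in E}$ for $LP_{MDN}(G,K)$ (we use
property (ii) of $\mathcal{C}$ in the third inequality; the pairs $\{u,v\}$ below are unordered)

$$
\begin{aligned}
\sum_{C \in \mathcal{C}} \sum_{e \in \delta(C)} x_e &\ge \sum_{C \in \mathcal{C}} \min(|C|, |K \setminus C|) \\
&\ge \sum_{C \in \mathcal{C}} \frac{|C| \cdot |K \setminus C|}{k} \\
&= \frac{1}{k} \sum_{C \in \mathcal{C}} \ \sum_{\{u,v\}:\ u \ne v \in K,\ C \text{ separates } (u,v)} 1 \\
&= \frac{1}{k} \sum_{u \ne v \in K} \big|\{C \in \mathcal{C} : C \text{ separates } (u,v)\}\big| \\
&\ge \frac{1}{k} \sum_{u \ne v \in K} d_G(u,v) \\
&= \frac{1}{2k} \sum_{u \in K} \sum_{v \in K,\ v \ne u} d_G(u,v) \\
&\ge \frac{1}{2k} \sum_{u \in K} \sigma_K(G) \\
&= \frac{\sigma_K(G)}{2}.
\end{aligned}
$$

Finally, by property (i) of $\mathcal{C}$ we have that $\sum_{C \in \mathcal{C}} \sum_{e \in \delta(C)} x_e \le \beta \cdot \sum_{e \in E} x_e$, which with the above inequality implies that
$\sum_{e \in E} x_e \ge \sigma_K(G)/(2\beta)$, as desired. □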

[11] also considered another graph parameter. Given the graph $G = (V,E)$, a subset $K$ of an even number of terminals
and a partition $M$ of $K$ into sets of size exactly two, define $d(G,M) = \sum_{(u,v) \in M} d_G(u,v)$. The quantity $d(G,M)$
is related to the following LP, which we will dub $LP_{MTCH}(G,K,M)$:

$$
\begin{aligned}
\min\ & \sum_{e \in E} x_e \\
\text{subject to}\ & \sum_{e \text{ crosses } C} x_e \ge m(C,M) && \text{for every cut } C \qquad (3) \\
& x_e \ge 0 && \text{for every } e \in E,
\end{aligned}
$$

where $m(C,M)$ is the number of pairs in $M$ separated by $C$. By abuse of notation let $LP_{MTCH}(G,K,M)$ also denote
the objective value of the above LP. The following result was implicitly argued in [11]. For the sake of completeness
we present a proof.

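Theorem 8.

$$LP_{MTCH}(G,K,M) \ge \Omega\left(\frac{d(G,M)}{\log k}\right).$$

Proof. Let $\mathcal{C}$ be the collection of cuts in $G$ guaranteed by Bourgain’s embedding (Hl^ that has the following two
guarantees: (i) every edge is cut by at most $\beta = O(\log k)$ cuts in $\mathcal{C}$ and (ii) for every $u \ne v \in V$, the pair is separated by at
least $d_G(u,v)$ cuts in $\mathcal{C}$.

Using the constraint (3) over all cuts in $\mathcal{C}$, we get that for the optimal solution $(x_e)_{e \in E}$ for $LP_{MTCH}(G,K,M)$ (we
use property (ii) of $\mathcal{C}$ in the last inequality)

$$
\begin{aligned}
\sum_{C \in \mathcal{C}} \sum_{e \in \delta(C)} x_e &\ge \sum_{C \in \mathcal{C}} \big|\{(u,v) \in M : C \text{ separates } (u,v)\}\big| \\
&= \sum_{(u,v) \in M} \big|\{C \in \mathcal{C} : C \text{ separates } (u,v)\}\big| \\
&\ge \sum_{(u,v) \in M} d_G(u,v) \\
&= d(G,M).
\end{aligned}
$$

Finally, by property (i) of $\mathcal{C}$ we have that $\sum_{C \in \mathcal{C}} \sum_{e \in \delta(C)} x_e \le \beta \cdot \sum_{e \in E} x_e$, which with the above inequality implies that
$\sum_{e \in E} x_e \ge d(G,M)/\beta$, as desired. □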

Finally, the quantities $\sigma_K(G)$ and the worst-case $d(G,M)$ are within a factor of 2 of each other:

Lemma C.1 ([11]). Let $K$ be a set of an even number of terminals and let $\mathcal{M}(K)$ denote the set of all disjoint pairings in
$K$. Then

$$\frac{1}{2} \cdot \sigma_K(G) \le \max_{M \in \mathcal{M}(K)} d(G,M) \le \sigma_K(G).$$

D Sub-additivity Property

We briefly argue that the two main families of functions that we consider in this paper lead to LPs that do satisfy
the sub-additive property:

• Steiner Tree constraints. There are sets of terminals $T_i \subseteq V$ (for $i \in [\ell]$) and $b^i(C) = 1$ if $C$ separates $T_i$ and 0
otherwise. Note that with these constraints LP^(G) and LP^(G) are the same as LP^j{G,K) and LP^j{G,K) 
that we saw in the introduction. 

• Multi-commodity flow constraints. We have a set of demands $D_i$ (for $i \in [\ell]$) and $b^i(C)$ is the number of
demand pairs in $D_i$ that are separated by $C$. Note that with these constraints $LP_U(G)$ is essentially the sum
of $LP_{MDN}(G,K_i)$ where $D_i$ consists of all pairs in $K_i$.


E Applications 

In this section, we apply the general techniques we have developed so far to obtain lower bounds for specific 
functions. However, we begin with our lower bound for all functions.

E.1 Lower Bound For Every Function

We prove here that every function needs $\Omega(ST(G,K))$ bits to be computed by any protocol.

Theorem 9. Let $f : \Sigma^K \to \{0,1\}$ be any function that depends on all of its input symbols. Then,

$$R(f, G, K, \Sigma) \ge \Omega(ST(G,K)).$$

Proof. Let $\Pi$ be any protocol in which the node $u$ in $V(G)$ computes the output of $f$. Take any cut $C$ of $G$ that
partitions $V(G)$ into $V^A, V^B$ and separates the set $K$ of terminals into $K^A$ and $K^B$, each of which is non-empty. We
argue that at least one bit is communicated in total across the edges of $\delta(C)$. This will be sufficient to establish our
theorem, using Lemma lE^ and Theorem!^ 

WLOG, assume $u \in V^A$ is the designated terminal that needs to know the final output bit. As $f$ depends on all
its input symbols, there is an assignment $a \in \Sigma^{K^A}$ to terminals in $K^A$ such that $f$ is determined by the assignment
to terminals in $K^B$, i.e. there exist $b, b' \in \Sigma^{K^B}$ such that $f(a,b) \ne f(a,b')$. Hence, when $a$ is the assignment to
$K^A$, there is at least 1 bit of communication across $\delta(C)$ from $V^B$ to $V^A$ for $u$ to output the answer correctly on all
inputs with probability greater than 1/2. Otherwise, if no communication is expected by nodes in $V^A$, then the
answer they give is independent of inputs to nodes in $K^B$. In this case, at least for one of the assignments $ab$ and
$ab'$, protocol $\Pi$ errs with probability at least 1/2.

For every other assignment to $K^A$, as long as there is no communication from $V^A$ to $V^B$, there is no way for
processors in $V^B$ to know that the assignment to $K^A$ is not $a$. Hence, if they do not communicate in this case, they
will also not communicate when $K^A$ is assigned $a$, which we argued is not possible. Thus, in every case, at least 1
bit of communication occurs on $\delta(C)$. □

We note that we only use the property of a valid protocol that it cannot have a deadlock (i.e. if one end point of 
an edge is expecting to receive some communication then the other end point has to communicate something) in 
the proof above. The rest of our proofs do not use this property explicitly. 

Theorem 9 also has the following interesting consequence. Recall that in our model there is one designated
terminal that needs to know the output bit. However, since the output bit can be transmitted to all the terminals
with $ST(G,K)$ amounts of additional communication, our model is equivalent (up to constant factors) to a related
model where all the terminals need to know the output bit at the termination of the protocol.


E.2 Bounds for DISJ 

We next prove bounds for one of the most well-studied functions in classical communication complexity: the set 
disjointness function (DISJ). 

We first note that for a given set of terminals $K$, where each terminal gets a subset of $[n]$ (as a vector in $\{0,1\}^n$),
one can compute their intersection by computing the running intersection from the leaves of the minimum Steiner
tree on $K$ in $G$ to its root. Each edge carries at most $n$ bits, which leads to the following result:

Proposition 1.

$$R(\mathrm{DISJ}, G, K, \{0,1\}^n) \le O(ST(G,K) \cdot n).$$
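The running-intersection protocol can be sketched as follows. This is a minimal illustration in Python, assuming the Steiner tree is already given as a rooted tree (the `children` map, the node names and the function name are hypothetical); it counts $n$ bits per tree edge, which is where the $ST(G,K) \cdot n$ term comes from.

```python
def disj_on_tree(children, root, inputs, n):
    """Return (sets_are_disjoint, bits_sent) for the running-intersection protocol.

    children: dict mapping a node to the list of its children in the rooted tree
    inputs:   dict mapping each terminal to its subset of range(n);
              nodes of the tree that are not terminals are simply absent
    """
    bits_sent = 0

    def up(node):
        nonlocal bits_sent
        # A node without an input behaves like the full universe [n].
        current = set(inputs.get(node, range(n)))
        for child in children.get(node, []):
            current &= up(child)
            bits_sent += n  # the child's n-bit characteristic vector crosses one edge
        return current

    common = up(root)
    return len(common) == 0, bits_sent

# Hypothetical example: a path r - s - t where s is a non-terminal Steiner node.
path = {"r": ["s"], "s": ["t"]}
```

For instance, with `n = 4`, inputs `{0, 1}` at `r` and `{1, 2}` at `t`, the two tree edges carry 8 bits in total and the sets are reported as intersecting.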


Next, we argue that the bound above is nearly tight. Towards this end, we claim that

Lemma E.1. Let $h$ be the function $h(M) = \lceil \log M \rceil$. Then DISJ is $h$-maximally hard on the star graph.

Proof. Let $\nu_s$ be the hard distribution for DISJ on an $s$-star from Braverman et al. [9]. Using standard tools of
information theory, it follows from that work that for such a hard distribution the set disjointness problem needs
$\Omega(sn)$ expected communication over an $s$-star.

Now consider any multicut $C$ of $K$ with $|C| = s$. Define $\mu_C$ as follows: let $(X_1, \dots, X_s)$ be a sample from $\nu_s$.
Then the terminals in the $i$-th explicit set in $C$ all get $X_i$. Finally, all the terminals in the implicit set get the all-ones
vector. It is easy to check that $\mu_C$ has the required property. □

The above lemma along with Theoremj^ immediately proves Theorem 3.

E.3 Composed Functions

Finally we prove bounds for some composed functions. 


E.3.1 Bounds for ED o XORn 


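In this section, we consider the composed function $\mathrm{ED} \circ \mathrm{XOR}_n$. For the sake of completeness, we recall its definition.
Let $G = (V,E)$ be the graph and, given $t$ subsets $K_1, \dots, K_t \subseteq V$ (which need not all be disjoint or distinct), we have
$\mathcal{K} \stackrel{\mathrm{def}}{=} \{K_1, \dots, K_t\}$. Given $k_i = |K_i|$ $n$-bit vectors $X^i_1, \dots, X^i_{k_i} \in \{0,1\}^n$ for every $i \in [t]$, define:

$$\mathrm{ED} \circ \mathrm{XOR}_n\big(X^1_1, \dots, X^1_{k_1}, \dots, X^t_1, \dots, X^t_{k_t}\big) = \mathrm{ED}\big(\mathrm{XOR}_n(X^1_1, \dots, X^1_{k_1}), \dots, \mathrm{XOR}_n(X^t_1, \dots, X^t_{k_t})\big).$$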

We now state the obvious upper bound for solving the $\mathrm{ED} \circ \mathrm{XOR}_n$ function. For notational convenience, define
$\sigma_{K_1,\dots,K_t}(G)$ to be the minimum of $\sigma_K(G)$ over every choice of $K$ that has exactly one terminal from $K_i$ for every $i \in [t]$.
Then we have the following upper bound.

Proposition 2. Let $k = \sum_{i=1}^{t} k_i$. Then

$$R(\mathrm{ED} \circ \mathrm{XOR}_n, G, \mathcal{K}, \{0,1\}^n) \le O\left(\sigma_{K_1,\dots,K_t}(G) \cdot \log k + \sum_{i=1}^{t} ST(G,K_i) \cdot \log k\right).$$

Proof. Note that with $O(ST(G,K_i) \cdot \log k)$ amounts of communication, every terminal in $K_i$ will know the hash$^{*}$ of
$\mathrm{XOR}_n(X^i_1, \dots, X^i_{k_i})$. Doing this for every $i \in [t]$ gives the second term in the claimed bound.

Let $u_1, \dots, u_t$ be such that $u_i \in K_i$ for every $i \in [t]$ and $\sigma_{K_1,\dots,K_t}(G) = \sigma_{\{u_1,\dots,u_t\}}(G)$. Then run the upper bound
protocol for ED using the hashes at the terminals in the set $\{u_1, \dots, u_t\}$. This latter part accounts for the first term
in the claimed bound. This completes the proof. □
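One way to see why the hash of the xor can be aggregated with only $O(\log k)$ bits per Steiner tree edge is that an inner-product hash over GF(2) is linear: the hash of a xor equals the xor of the hashes. The sketch below illustrates this in Python; the function names, the seed-based stand-in for public randomness and the encoding of vectors as integers are assumptions made for the illustration only.

```python
import random

def make_hash(n, num_bits, seed):
    """Pick num_bits public random vectors in {0,1}^n, each encoded as a Python int."""
    rng = random.Random(seed)  # a shared seed stands in for public randomness
    return [rng.getrandbits(n) for _ in range(num_bits)]

def hash_vector(rows, x):
    """Inner product over GF(2) of x with each of the random vectors."""
    return tuple(bin(r & x).count("1") % 2 for r in rows)

def xor_tuples(a, b):
    """Coordinate-wise xor of two hash values."""
    return tuple(p ^ q for p, q in zip(a, b))
```

With this linearity, each node of the tree can xor the short hashes it receives with the hash of its own input and forward the result, instead of forwarding $n$-bit vectors.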


We will now prove Theorem 1, which is an almost matching lower bound to the upper bound in Proposition 2.
We will do so by proving two lower bounds separately: one each for the two terms in the upper bound. Note that
this immediately implies a lower bound that is the sum of the two terms (up to a factor of 1/2) as desired.

The first term follows immediately from existing results [11]:


Lemma E.2.

$$R(\mathrm{ED} \circ \mathrm{XOR}_n, G, \mathcal{K}, \{0,1\}^n) \ge \Omega\left(\frac{\sigma_{K_1,\dots,K_t}(G)}{\log t}\right).$$

Proof. Let $u_i \in K_i$ for every $i \in [t]$ be such that $\sigma_{K_1,\dots,K_t}(G) = \sigma_{\{u_1,\dots,u_t\}}(G)$. Let $\mu_t$ be the hard distribution from [11]
for ED on $t$ terminals. Assign the $t$ inputs from $\mu_t$ to the terminals $u_i$, and let all the other terminals in $\cup_{i=1}^{t} K_i$ get inputs that
are distinct from each other and have a support disjoint from the support in $\mu_t$. Then the lower bound for $\mu_t$
from [11] implies the claimed bound. □

Remark 1. We note that the proof can also be extended to replace $\sigma_{K_1,\dots,K_t}(G)$ by the maximum $\sigma_{K'}(G)$, where $K'$
contains exactly one terminal from each of $K_1, \dots, K_t$. However, this does not lead to any contradiction since it is easy to
check that $\sigma_{K'}(G) \le \sigma_{K_1,\dots,K_t}(G) + \sum_{i=1}^{t} ST(G,K_i)$ and hence even if we use the stronger bound above, the total
lower bound does not exceed the upper bound.

Next, we will prove a lower bound matching the second term in the upper bound in Proposition 2 up to poly-log
factors. Before that we consider a specific problem that will be useful in the proof of our lower bound.

Lemma E.3. Alice and Bob get $t$ inputs $A_1, \dots, A_t \in \{0,1\}^n$ and $B_1, \dots, B_t \in \{0,1\}^n$. They want to compute
$\mathrm{ED}(\mathrm{XOR}_n(A_1,B_1), \dots, \mathrm{XOR}_n(A_t,B_t))$.
Consider the distribution $\nu_t$ where each $A_i$ and $B_j$ is picked uniformly and independently at random. Then for
$n \ge 5 \log t$, any protocol with bounded error that computes $\mathrm{ED}(\mathrm{XOR}_n(A_1,B_1), \dots, \mathrm{XOR}_n(A_t,B_t))$ correctly on all
inputs has expected cost (under $\nu_t$) of $\Omega(t)$.

$^{*}$ In particular, here the hash is the inner product of $O(\log k)$ random vectors with the input. The random vectors are generated using public
randomness.

Proof. We will use the fact that the set disjointness problem where Alice and Bob get two sets of size $t$, where each
set is picked by picking $t$ uniformly random elements (with replacement) from $\{0,1\}^n$, has an expected communication
complexity lower bound of $\Omega(t)$: see e.g. [11].$^{\dagger}$ Let us call this hard distribution $\mu_t$.

Now for the sake of contradiction assume that there exists a protocol $\Pi$ that computes $\mathrm{ED}(\mathrm{XOR}_n(A_1,B_1), \dots, \mathrm{XOR}_n(A_t,B_t))$
correctly on all inputs with expected cost (under $\nu_t$) $o(t)$. We will use this to obtain a protocol that solves the set
disjointness problem above for sets of size $t/2$ with expected cost $o(t)$ under $\mu_{t/2}$, which will lead to a contradiction.
Let us assume that Alice gets $\{X_1, \dots, X_{t/2}\}$ and Bob gets $\{Y_1, \dots, Y_{t/2}\}$ from the distribution $\mu_{t/2}$. Alice and
Bob construct the sets $\{A_1, \dots, A_t\}$ and $\{B_1, \dots, B_t\}$ as follows. Using shared randomness Alice and Bob both pick
uniformly random elements $Z_1, \dots, Z_t \in \{0,1\}^n$ and compute their sets as follows:

$$A_i = \begin{cases} \mathrm{XOR}_n(X_i, Z_i) & \text{if } i \le t/2 \\ Z_i & \text{otherwise} \end{cases}$$

and

$$B_i = \begin{cases} Z_i & \text{if } i \le t/2 \\ \mathrm{XOR}_n(Y_{i-t/2}, Z_i) & \text{otherwise.} \end{cases}$$

Note that the induced distribution on $\{A_1, \dots, A_t\}$ and $\{B_1, \dots, B_t\}$ is exactly the same as $\nu_t$. Further, we have
$\mathrm{ED}(\mathrm{XOR}_n(A_1,B_1), \dots, \mathrm{XOR}_n(A_t,B_t)) = 1$ if and only if $\{X_1, \dots, X_{t/2}\}$ and $\{Y_1, \dots, Y_{t/2}\}$ are disjoint. Thus, if Alice and
Bob run $\Pi$ on the inputs $A_1, \dots, A_t$ and $B_1, \dots, B_t$ as above, then they can solve the disjointness problem on inputs
under the distribution $\mu_{t/2}$ with $o(t)$ expected cost, as desired.

□
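The masking step of this reduction can be sketched as follows, as an illustration only. Vectors are encoded as Python integers, indices are 0-based (so the condition $i \le t/2$ becomes `i < half`), and the function names are hypothetical. The point of the sketch is that the masks $Z_i$ cancel, so the $t$ xor values are exactly Alice's elements followed by Bob's.

```python
import random

def build_inputs(xs, ys, n, rng):
    """Turn set-disjointness inputs (xs for Alice, ys for Bob) into ED-of-XOR inputs."""
    half = len(xs)
    assert len(ys) == half
    # Shared randomness: Z_1, ..., Z_t with t = 2 * half.
    zs = [rng.getrandbits(n) for _ in range(2 * half)]
    a = [xs[i] ^ zs[i] if i < half else zs[i] for i in range(2 * half)]
    b = [zs[i] if i < half else ys[i - half] ^ zs[i] for i in range(2 * half)]
    return a, b

def element_distinctness(values):
    """Return 1 if all values are pairwise distinct and 0 otherwise."""
    return int(len(set(values)) == len(values))
```

Note that element distinctness also fails if a set contains a repeated element, which is the low-probability event handled in the footnote on sampling with replacement.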


We are now ready to prove a matching lower bound for the second term in the upper bound in Proposition 2.


Lemma E.4.

$$R(\mathrm{ED} \circ \mathrm{XOR}_n, G, \mathcal{K}, \{0,1\}^n) \ge \Omega\left(\frac{\sum_{i=1}^{t} ST(G,K_i)}{\log|V| \log\log|V|}\right).$$

Proof. Consider the hard distribution $\mu$, where each of the inputs in $\cup_{i=1}^{t} K_i$ is chosen uniformly and independently
at random from $\{0,1\}^n$. Now consider any cut $C$ in the graph $G$. Let $\mathrm{ED} \circ \mathrm{XOR}_n(C)$ denote the induced two-party
problem. We claim that this problem needs $\Omega(t')$ amounts of expected communication, where $t'$ is the number
of sets $K_i$ that are separated by $C$. Assuming this claim, note that by Corollary 2 the effective lower bound for the
entire problem is $\Omega(LP_L(G))$ where $\ell = t$ and $b^j(C) = 1$ if $K_j$ is cut by $C$ and 0 otherwise (for any $j \in [t]$). Further,
note that the values $b^j(C)$ are sub-additive. By Theorem 4 we have a lower bound of

$$\Omega\left(\frac{LP_U(G)}{\log|V| \log\log|V|}\right).$$

To get the claimed lower bound, observe that the objective of $LP_U(G)$ is just the sum of $LP_{ST}(G,K_i)$ for $i \in [t]$.
Finally, since we can minimize the objective of $LP_U(G)$ by separately minimizing each instance of the Steiner tree
LP, Theorem 6 implies that we have $LP_U(G) \ge \Omega\big(\sum_{i=1}^{t} ST(G,K_i)\big)$, which implies the claimed lower bound.

We complete the proof by arguing the claimed lower bound on the two-party function $\mathrm{ED} \circ \mathrm{XOR}_n(C)$ for any cut
$C$. WLOG assume that $C$ separates the sets $K_1, \dots, K_{t'}$. Then note that if Alice gets the inputs from one side of the
cut $C$ and Bob gets the inputs from the other side, then they are trying to solve $\mathrm{ED}(\mathrm{XOR}_n(A_1,B_1), \dots, \mathrm{XOR}_n(A_{t'},B_{t'}))$,
where $A_i$ is the bit-wise xor of all inputs in $K_i$ that Alice gets and $B_i$ is the bit-wise xor of the inputs from $K_i$ that
Bob gets. Further, note that the distribution on $A_1, \dots, A_{t'}$ and $B_1, \dots, B_{t'}$ is the same as the hard distribution in
Lemma E.3. Thus, Lemma E.3 implies the claimed lower bound of $\Omega(t')$. □

$^{\dagger}$ Technically, in [11] the hard distribution for set disjointness has the elements in the sets for Alice and Bob chosen without replacement.
However, the probability that either Alice or Bob have a set of size strictly less than $t$ or have an intersection is at most $O(t^2/2^n)$, which by our lower
bound on $n$ is negligible.

E.3.2 Bounds for XOR o IP


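There are multiple definitions of $\mathrm{XOR} \circ \mathrm{IP}$ that make sense. In this subsection we will consider the version, which
we dub $\mathrm{XOR} \circ \mathrm{IP}_n$, that gives the cleanest bounds. Given the set of terminals $\mathcal{K}$ divided into $t$ subsets of terminals
$K_1, \dots, K_t$, let $M_i$ be a set of disjoint pairings of $K_i$ such that $d(G,M_i) = \Theta(\sigma_{K_i}(G))$ (by Lemma C.1 such an $M_i$ exists).
Given $k_i \stackrel{\mathrm{def}}{=} |K_i|$ $n$-bit vectors $X^i_1, \dots, X^i_{k_i} \in \{0,1\}^n$ for every $i \in [t]$, define:

$$\mathrm{XOR} \circ \mathrm{IP}_n\big(X^1_1, \dots, X^1_{k_1}, \dots, X^t_1, \dots, X^t_{k_t}\big) = \mathrm{XOR}_1\big(\mathrm{IP}_{M_1}(X^1_1, \dots, X^1_{k_1}), \dots, \mathrm{IP}_{M_t}(X^t_1, \dots, X^t_{k_t})\big),$$

where $\mathrm{XOR}_1$ denotes the function that first applies $\mathrm{XOR}_n$ on the $t$ vectors and then takes the xor of the resulting $n$
bits, and we consider the following version of the inner product function. Given a set of disjoint pairings $M$ of $K$,
define $\mathrm{IP}_M : (\{0,1\}^n)^K \to \{0,1\}^n$ as follows. Given inputs $X^i = (X^i_1, \dots, X^i_n) \in \{0,1\}^n$ for every $i \in K$,
$\mathrm{IP}_M\big((X^i)_{i \in K}\big) = \big(\oplus_{(u,v) \in M} X^u_j \wedge X^v_j\big)_{j=1}^{n}$.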

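Now consider the obvious protocol to solve $\mathrm{XOR} \circ \mathrm{IP}_n$: first compute all the $\mathrm{IP}_{M_i}(X^i_1, \dots, X^i_{k_i})$ using the trivial
$\sigma_{K_i}(G) \cdot n$ protocol and then store the xor of the resulting $n$ bits at, say, $u_i \in K_i$. At this point, with $O\big(\sum_{i=1}^{t} \sigma_{K_i}(G) \cdot n\big)$
bits of communication, we have $t$ bits, one at each $u_i \in K_i$. Then we compute the final desired output bit by using the Steiner
tree on $\{u_1, \dots, u_t\}$. This implies the following overall upper bound:

Proposition 3. Let $u_i \in K_i$ for every $i \in [t]$. Then

$$R(\mathrm{XOR} \circ \mathrm{IP}_n, G, \mathcal{K}, \{0,1\}^n) \le O\left(ST(G, \{u_1, \dots, u_t\}) + \sum_{i=1}^{t} \sigma_{K_i}(G) \cdot n\right).$$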


Recall that Corollary 1 shows a nearly matching lower bound. One can easily show a matching lower bound for
the first term in the sum above (e.g. by the argument for $\mathrm{XOR}_n$ for $n = 1$ from the introduction). We can also prove
a nearly matching lower bound for the second term:


Lemma E.5.

$$R(\mathrm{XOR} \circ \mathrm{IP}_n, G, \mathcal{K}, \{0,1\}^n) \ge \Omega\left(\frac{\sum_{i=1}^{t} \sigma_{K_i}(G) \cdot n}{\log k}\right).$$


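To prove this we will need the following result (the proof appears in Appendix F.2):


Lemma E.6. Let $K_1, \dots, K_t$ and $M_1, \dots, M_t$ be defined as above. For any $j \in [\ell]$, $b^j(C)$ is defined to be the
number of pairs in $M_j$ separated by the cut $C$. Then for $k = \sum_{i=1}^{t} |K_i|$,

$$\Omega\left(\frac{\sum_{i=1}^{t} \sigma_{K_i}(G)}{\log k}\right) \le LP_L(G) \le LP_U(G) \le O\left(\sum_{i=1}^{t} \sigma_{K_i}(G)\right).$$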


Proof of Lemma E.5. Let $\mu$ be the distribution where the $k = \sum_{i=1}^{t} k_i$ vectors are picked uniformly and independently
at random from $\{0,1\}^n$. Let $C$ be an arbitrary cut of $G$ and let $k'_i$ be the number of pairs in $M_i$ that are cut by $C$.
Then note that the induced two-party problem is essentially trying to solve the two-party inner product function
on $\big(\sum_{i=1}^{t} k'_i\big) \cdot n$ bits. Further, conditioned on all valid fixings of inputs corresponding to pairs that are not separated
by $C$, the remaining inner product problem mentioned above corresponds to Alice (who receives all the vectors
on one side of $C$) receiving a uniform vector with $\big(\sum_{i=1}^{t} k'_i\big) \cdot n$ uniform bits. Similarly for Bob. It is well-known
[m that for this induced distribution the two-party lower bound on the expected cost is $\Omega\big(\big(\sum_{i=1}^{t} k'_i\big) \cdot n\big)$ bits of
communication.

Note that by Corollary 2 the effective lower bound for the entire problem is $\Omega(LP_L(G))$ where $\ell = t$ and $b^j(C) =
k'_j \cdot n$. Lemma E.6 completes the proof. □

E.3.3 Bounds for XOR o ED


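In this section, we consider in some sense the “reverse” of the $\mathrm{ED} \circ \mathrm{XOR}_n$ function. The function
$\mathrm{XOR}_1 \circ \mathrm{ED} : (\{0,1\}^n)^{\sum_{i=1}^{t} k_i} \to \{0,1\}$ is defined as follows. Let the set of terminals $\mathcal{K}$ be divided into $t$ subsets of terminals $K_1, \dots, K_t$.
Given $k_i \stackrel{\mathrm{def}}{=} |K_i|$ $n$-bit vectors $X^i_1, \dots, X^i_{k_i} \in \{0,1\}^n$ for every $i \in [t]$, define:

$$\mathrm{XOR}_1 \circ \mathrm{ED}\big(X^1_1, \dots, X^1_{k_1}, \dots, X^t_1, \dots, X^t_{k_t}\big) = \oplus_{i=1}^{t}\, \mathrm{ED}\big(X^i_1, \dots, X^i_{k_i}\big).$$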

Now consider the trivial two-step protocol that results in the following upper bound:

Lemma E.7. Choose $t$ terminals $u_i \in K_i$ for every $i \in [t]$. Then

$$R(\mathrm{XOR}_1 \circ \mathrm{ED}, G, \mathcal{K}, \{0,1\}^n) \le O\left(ST(G, \{u_1, \dots, u_t\}) + \sum_{i=1}^{t} \sigma_{K_i}(G) \cdot \log k\right).$$

Proof. Using the argument in the proof of Proposition 2, with $O\big(\sum_{i=1}^{t} \sigma_{K_i}(G) \cdot \log k\big)$ bits of communication, every $u_i$
knows the value of $\mathrm{ED}(X^i_1, \dots, X^i_{k_i})$. Then the resulting $\mathrm{XOR}_1$ can be computed with $O(ST(G, \{u_1, \dots, u_t\}))$ bits of
communication by progressively computing the $\mathrm{XOR}_1$ along the corresponding Steiner tree. □

Recall that Theorem shows a nearly match ing lower bound. We can have matching lower bound term for 
the first term in the sum above from Theorem 9.$^{*}$ The more interesting part is to prove a matching lower bound
for the second term. Towards that end, we will need a result on classical 2-party Set-Disjointness: let $\mathrm{UDISJ}_n$ be the
unique-set disjointness problem on $2n$ bits that has the following promise. Alice and Bob get $n$-bit strings such
that they have at most one occurrence of an all-one column in their inputs, i.e. their sets have at most one element
in common. They want to find out if their sets intersect. Pair the input bits of Alice and Bob as $(X_1, Y_1), \dots, (X_n, Y_n)$.
Each pair $(X_i, Y_i)$ is sampled independently from a distribution $\mu$ that we describe next. To draw a sample $(U, V)$
from $\mu$, we first throw a uniformly random coin $D$. If $D = 0$, $U$ is fixed to 0 and $V$ is drawn uniformly at random
from $\{0,1\}$. If $D = 1$, the roles of $U$ and $V$ are reversed. The following result was observed by [11], using the seminal
work of Bar-Yossef et al. [5]:


Theorem 10. Let $\Pi$ be any 2-party randomized protocol solving $\mathrm{UDISJ}_n$ with bounded error $\epsilon < 1/2$. Then, its
expected communication cost w.r.t. input distribution $\mu^n$ is at least $(1 - 2\sqrt{\epsilon})(n/4)$.
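The input distribution used in this theorem is easy to sample, and sampling it makes one of its features visible: since one coordinate of every pair is forced to 0, a draw from $\mu^n$ never contains an all-one column. The following Python sketch illustrates this; the function names are hypothetical.

```python
import random

def sample_pair(rng):
    """One draw (U, V): the coin D fixes one side to 0, the other side is a uniform bit."""
    d = rng.randrange(2)
    free = rng.randrange(2)
    return (0, free) if d == 0 else (free, 0)

def sample_inputs(n, rng):
    """Draw n independent pairs and split them into Alice's and Bob's n-bit inputs."""
    pairs = [sample_pair(rng) for _ in range(n)]
    x = [u for u, _ in pairs]
    y = [v for _, v in pairs]
    return x, y
```

In other words, the distribution is supported entirely on disjoint instances, and the error guarantee of the protocol on intersecting instances is what the lower bound argument exploits.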

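We are now ready to prove a nearly tight lower bound for the second term in Lemma E.7.

Lemma E.8. For $n \ge \log k + 2$,

$$R(\mathrm{XOR}_1 \circ \mathrm{ED}, G, \mathcal{K}, \{0,1\}^n) \ge \Omega\left(\frac{\sum_{i=1}^{t} \sigma_{K_i}(G)}{\log k}\right).$$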




Proof. We assume for convenience that each $|K_i|$ is even. Consider a pairing $M_i$ of nodes in $K_i$, for each $i$, such that
$d(G,M_i) \ge (1/2) \cdot \sigma_{K_i}(G)$ (such an $M_i$ exists thanks to Lemma C.1). Let $M$ be the multi-set union $M_1 \cup \cdots \cup M_t$ with
$|M| = k/2 = m$. Now for ease of description, we denote the inputs at the pairs of terminals in $M$ as $(X_1, Y_1), \dots, (X_m, Y_m)$.
We fix the first $\log k$ bits of each of the pair of terminals $X_j, Y_j$ to a string $a_j \in \{0,1\}^{\log k}$ such that $a_j \ne a_i$ for $i \ne j$.
We call $a_j$ the prefix string of its pair. In the ensuing discussion we look at only restricted inputs, where the first
$\log k$ bits of the inputs of each terminal are fixed to its respective prefix string. To keep notation simple, we still
denote the unfixed bits of the $i$-th pair in $M$ as $(X_i, Y_i)$.

We now describe the remaining input distribution: let $s_1, s_x, s_y$ be three distinct strings in $\{0,1\}^{n'}$, where
$n' = n - \log k$. Such three strings exist because of our assumed bound on $n$. Define auxiliary random variables
$D_1, \dots, D_m$ that are i.i.d. and each takes a value in $\{0,1\}$ uniformly at random. Then if $D_i = 0$, set $X_i = s_x$ and $Y_i$
takes uniformly at random a value in $\{s_y, s_1\}$. If $D_i = 1$, then $Y_i = s_y$ and $X_i$ takes uniformly at random a value in $\{s_x, s_1\}$. This
completes the description of our input distribution that we denote by $\mathcal{D}$.

$^{*}$ Technically, we get a lower bound of $\Omega\big(ST(G, \cup_{i=1}^{t} K_i)\big)$, which of course implies a lower bound of $\Omega(ST(G, \{u_1, \dots, u_t\}))$.

We will show that we can invoke Lemma|R^ using distribution $\mathcal{D}$. To do so, we analyze the expected
communication cost of any protocol $\Pi$ solving $\mathrm{XOR}_1 \circ \mathrm{ED}$ on $G$, across a cut $C$. Let the number of pairs of $M_i$ cut by $C$ be $m'_i$
and $m' = m'_1 + \cdots + m'_t$. Let $\mathcal{K}_C$ denote the set of terminals whose mate in $M$ is separated by $C$. Let $\bar{\mathcal{K}}_C = \mathcal{K} \setminus \mathcal{K}_C$.
Consider any assignment $a$ to the terminals in $\bar{\mathcal{K}}_C$ that is supported by $\mathcal{D}$ and let the induced protocol be denoted
by $\Pi_a$. We claim that we can solve unique (2-party) set-disjointness over $m'$ bits using $\Pi_a$ as follows: Alice and
Bob associate each of their co-ordinates with a separated pair in $M_C$. Alice and Bob both replace their 1’s by the
string $s_1$. Alice replaces her 0’s by $s_x$ and Bob replaces his by $s_y$. Then they simulate $\Pi_a$ and communicate to each
other whenever and whatever $\Pi_a$ communicates across $C$. It is simple to verify that this way Alice and Bob can
solve unique Set-Disjointness: if there is no all-1 column in their input, $\Pi_a$ outputs $r \bmod 2$ w.h.p., where $r$ is the
number of $M_i$’s that are separated by $C$. If they do have a (unique) all-1 column, $\Pi_a$ outputs $(r - 1) \bmod 2$ w.h.p.
Further, the distribution induced on inputs of terminals in $\mathcal{K}_C$, when Alice and Bob’s input distribution is sampled
from $\mu^{m'}$ (recall the definition of $\mu$ from Theorem 10), is precisely the distribution $\mathcal{D}$ induces on $\mathcal{K}_C$ conditioned on
$a$ assigned to inputs in $\bar{\mathcal{K}}_C$.

Thus, by Theorem 10, the expected communication of $\Pi_a$ over the cut edges of $C$ is $\Omega(m')$, for any $a$. Hence, the
expected communication of $\Pi$ over $C$ is $\Omega(m')$. Corollary 2 and Lemma E.6 complete the proof.

□ 
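The replacement step that Alice and Bob perform in the proof above can be sketched as follows. This is an illustration only: the function names and the three concrete strings are hypothetical, and the code covers just the encoding of the separated pairs, not the simulation of the protocol. It shows the property the reduction relies on: the two terminals of a separated pair hold equal strings exactly at an all-1 column.

```python
def embed_udisj(alice_bits, bob_bits, one_string, alice_zero, bob_zero):
    """Map a 2-party unique-disjointness instance to inputs for the separated pairs."""
    assert len({one_string, alice_zero, bob_zero}) == 3, "the three strings must be distinct"
    xs = [one_string if bit else alice_zero for bit in alice_bits]
    ys = [one_string if bit else bob_zero for bit in bob_bits]
    return xs, ys

def colliding_pairs(xs, ys):
    """Indices of the pairs whose two terminals hold the same string."""
    return [i for i, (x, y) in enumerate(zip(xs, ys)) if x == y]
```

Because the distinct prefix strings keep terminals from different pairs distinct, such a collision is the only way an element-distinctness instance inside some $K_i$ can change its value, which flips the parity of the final xor.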


F Omitted Proofs from Section 13711 

F.1 Proof of Theorem 4

We first state a simple property of sub-additive values: 

Lemma F.1. Let $G = (V,E)$ be a tree and let $b^i(C)$ for $i \in [\ell]$ be the constraint values for $LP_U(G)$ that satisfy the
sub-additive property. For any edge $e \in E$, let $C_e$ denote the cut formed by removing $e$ from $G$. Then for any cut $C$ of $G$ we
have

$$\sum_{e \in \delta(C)} b^i(C_e) \ge b^i(C).$$

Proof. This follows from the fact that $C = \cup_{e \in \delta(C)} C_e$ (since $G$ is a tree) and the definition of the sub-additive
property. □

In the rest of this subsection, we will prove Theorem 4. We begin with the upper bound in Theorem 4, which is
trivial.


Lemma F.2. For any graph $G$,

$$LP_L(G) \le LP_U(G).$$

Proof. Consider any feasible solution $(x_i)_{i \in [\ell]}$ for $LP_U(G)$, where $x_i = (x_{i,e})_{e \in E}$. Then note that the vector $x = (x_e)_{e \in E}$
defined as

$$x_e = \sum_{i=1}^{\ell} x_{i,e}$$

is also a feasible solution for $LP_L(G)$. □


To complete the proof, we now focus on proving the lower bound in Theorem 4. We first begin by observing
that the two LPs are essentially the same when $G$ is a tree:

Lemma F.3. For any tree $T = (V,E)$ (and values $b^i(C)$ for any $i \in [\ell]$ and cut $C$ with the sub-additive property), we
have

$$LP_L(T) = LP_U(T).$$

Proof. The proof basically follows by noting that for a tree $T$, we only need to consider some special cuts. In
particular, for every edge $e \in E$, let $C_e$ denote the cut that only cuts the edge $e$ (in other words, the two sides of the
cut are formed by the two subgraphs obtained by removing $e$ from $T$).

We first claim that

$$LP_L(T) \ge \sum_{e \in E} \sum_{i=1}^{\ell} b^i(C_e).$$

To see this, consider any feasible solution $x \in \mathbb{R}^{E}$ for $LP_L(T)$. We have from the constraint on $C_e$ for every $e \in E$ that

$$x_e \ge \sum_{i=1}^{\ell} b^i(C_e).$$

Summing the above over all $e \in E$ completes the claim.
Finally, we argue that

$$LP_U(T) \le \sum_{e \in E} \sum_{i=1}^{\ell} b^i(C_e),$$

which with Lemma F.2 will complete the proof. Consider the specific vector $(x_i)_{i \in [\ell]}$ such that for every $i \in [\ell]$ and
$e \in E$, we have

$$x_{i,e} = b^i(C_e).$$

Note that the proof will be complete if we can show that the above vector is a feasible solution for $LP_U(T)$. Notice
that by the fact that $T$ is a tree the above vector indeed does satisfy all the constraints corresponding to the cuts $C_e$
for every $e \in E$. Now consider an arbitrary cut $C$. Indeed we have for every $i \in [\ell]$:

$$\sum_{e \in \delta(C)} x_{i,e} = \sum_{e \in \delta(C)} b^i(C_e) \ge b^i(C),$$

where the inequality follows from Lemma F.1.

□
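The tree case can be checked mechanically on small instances. The sketch below, in Python, takes the Steiner tree constraints as the sub-additive values (an assumption made for the illustration: the $i$-th value of a cut is 1 if the cut separates the $i$-th terminal set), builds the per-edge values from single-edge cuts, and verifies by brute force that they satisfy every cut constraint. All names and the example star are hypothetical.

```python
from itertools import combinations

def component(tree_edges, removed, start):
    """Vertices reachable from start in the tree once the edge `removed` is deleted."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for u, v in tree_edges:
            if (u, v) == removed:
                continue
            for a, b in ((u, v), (v, u)):
                if a == node and b not in seen:
                    seen.add(b)
                    stack.append(b)
    return seen

def separates(side, terminal_set):
    """1 if the cut with the given side splits the terminal set, else 0."""
    inside = side & terminal_set
    return int(bool(inside) and inside != terminal_set)

def edge_demands(tree_edges, terminal_sets):
    """x[i][e] is the i-th constraint value of the single-edge cut C_e."""
    return [
        {e: separates(component(tree_edges, e, e[0]), t) for e in tree_edges}
        for t in terminal_sets
    ]

def feasible_for_all_cuts(vertices, tree_edges, terminal_sets, x):
    """Brute-force check of every cut constraint, one terminal set at a time."""
    vs = sorted(vertices)
    for r in range(1, len(vs)):
        for subset in combinations(vs, r):
            side = set(subset)
            for i, t in enumerate(terminal_sets):
                crossing = sum(x[i][(u, v)] for u, v in tree_edges if (u in side) != (v in side))
                if crossing < separates(side, t):
                    return False
    return True

# Hypothetical example: a star with centre c and two terminal sets.
star = [("c", "p"), ("c", "q"), ("c", "r")]
sets = [{"p", "q"}, {"q", "r"}]
x = edge_demands(star, sets)
```

On this star each terminal set needs two edges, so the total value is 4, which matches the sum of the two Steiner tree costs.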


Thus, we are done for the case when $G$ is a tree. For the more general case of a connected graph $G$, we will just
embed $G$ into one of its subtrees with low distortion. This basically follows a similar trick used in [3]. We say a
graph $G$ embeds with distortion $\alpha$ onto (a distribution $\mathcal{D}$ on) its subtrees if for every $u, v \in V$, we have
$\alpha \cdot d_G(u,v) \ge \mathbb{E}_{T \sim \mathcal{D}}[d_T(u,v)]$. (Note that for every subtree $T$ of $G$, we have $d_T(u,v) \ge d_G(u,v)$.)

We will now prove the following result:

Lemma F.4. Let $G = (V,E)$ embed into its subtrees under distribution $\mathcal{D}$ with distortion $\alpha$. Then we have

$$LP_L(G) \ge \frac{1}{\alpha} \cdot LP_U(G).$$

Proof. Using the embedding trick of [3], we will show that there exists a subtree $T$ of $G$ such that

$$LP_L(T) \le \alpha \cdot LP_L(G) \quad \text{and} \quad LP_U(G) \le LP_U(T).$$


Note that the above along with Lemma F.3 completes the proof. Further, note that the second inequality in the
above just follows from the fact that $T$ is a subtree of $G$. Hence, to complete the proof we only need to prove the
first inequality.

A word of clarification. When we talk about the constraints in $LP_L(T)$ and $LP_L(G)$, we have the same $b^i(C)$
value for each cut. However, note that the set $\delta(C)$ could be different for $G$ and $T$.

Towards this end, consider an optimal solution x e for LP^(G). From this, we will construct a feasible solu¬ 
tion x' E for LP^(r) whose expected cost is bounded, i.e. 




E 4 

eeElT) 


< a- Y ^e- 
eeB(G) 


Markov’s inequality will then complete the proof 

Finally, we define the solution x' for LP^(r). Consider the following algorithm (for any given T): 


( 4 ) 
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1. Initialize Xg ^ 0 for every e E i?. 

2. For every e = {u, v] e E such that Xg > 0 do the following 

(a) For the unique path Pu,v that connects u and v in F do the following 
• For every e' e Pu,v: do x^, ^ x', + Xg. 
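
The rerouting steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not code from the paper: the function names `tree_path` and `reroute`, the dictionary encoding of the graph and tree, and the example weights are all hypothetical. Each weight $x_e$ on an edge of $G$ is added to every tree edge on the unique tree path between the endpoints of $e$.

```python
def tree_path(tree_adj, u, v):
    """Edges on the unique u-v path in a tree given as an adjacency dict (assumed encoding)."""
    parent = {u: None}
    stack = [u]
    while stack:
        a = stack.pop()
        if a == v:
            break
        for b in tree_adj[a]:
            if b not in parent:
                parent[b] = a
                stack.append(b)
    path = []
    while parent[v] is not None:          # walk back from v to u
        path.append(frozenset((v, parent[v])))
        v = parent[v]
    return path


def reroute(x, tree_adj):
    """Push each weight x[(u, v)] of G onto the tree path between u and v."""
    x_prime = {frozenset((a, b)): 0 for a in tree_adj for b in tree_adj[a]}
    for (u, v), weight in x.items():
        if weight > 0:
            for tree_edge in tree_path(tree_adj, u, v):
                x_prime[tree_edge] += weight
    return x_prime


# Hypothetical example: G is the 4-cycle 0-1-2-3-0, T is the path 0-1-2-3.
tree = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x = {(0, 1): 1, (1, 2): 2, (2, 3): 3, (3, 0): 4}
x_prime = reroute(x, tree)
```

In this example the non-tree edge $(3, 0)$ has tree distance 3, so its weight is charged to all three tree edges, and the total of `x_prime` equals $\sum_{e=(u,v)} d_T(u,v) \cdot x_e$.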

We first argue that the vector $x'$ computed by the algorithm above is a feasible solution to $\mathrm{LP}^{L}(T)$. Consider an arbitrary cut $C$ in $G$ and consider any $e = (u, v) \in \delta_G(C)$ such that $x_e > 0$. Now consider the same cut $C$ in $T$. Note that in this case there has to be at least one edge $e' \in P_{u,v}$ such that $e' \in \delta_T(C)$. Thus, we have

$$\sum_{e' \in \delta_T(C)} x'_{e'} \ge \sum_{e \in \delta_G(C) : x_e > 0} x_e \ge \sum_{i=1}^{\ell} b^i(C),$$

where the last inequality follows since $x$ is a feasible solution for $\mathrm{LP}^{L}(G)$. Thus, we have shown that $x'$ is a feasible solution.

Finally, we prove (4). Note that by the algorithm above, we have

$$\sum_{e \in E(T)} x'_e = \sum_{e = (u,v) \in E(G)} d_T(u, v) \cdot x_e.$$

Now (4) follows from the above, linearity of expectation and the fact that $G$ embeds with a distortion of $\alpha$ under $\mathcal{D}$. □

It is known that any graph $G = (V, E)$ can be embedded into a distribution of its subtrees with distortion $O(\log|V| \log\log|V|)$ (see e.g. [3]), which in turn proves Theorem 4.

Remark 2. It is natural to wonder if one can use an embedding of a graph into a distribution of trees (instead of subtrees as we do) and not lose the extra $\log\log|V|$ factor (since for trees one can get a distortion of $O(\log|V|)$ [10]). We do not see how to use this result: in particular, in our proof of Lemma F.4 we do not see how to guarantee that the vector $(x'_e)_{e \in E(T)}$ satisfies the corresponding $\mathrm{LP}^{L}(T)$ constraint. In short, this is because the edges in $T$ for the result in [10] have weights (say $w_e$ for every $e \in E(T)$), so we can no longer prove the (stronger) inequality $\sum_{e' \in \delta_T(C)} w_{e'} \cdot x'_{e'} \ge \sum_{e \in \delta_G(C) : x_e > 0} x_e$.


F.2 Proof of LemmaESl 

Proof. The inequality $\mathrm{LP}^{L}(G) \le \mathrm{LP}^{U}(G)$ follows from Lemma F.2.

We begin with the last inequality. Towards this end we present a feasible solution for $\mathrm{LP}^{U}(G)$. Fix an $i \in [\ell]$. Now consider the following algorithm to compute $x_{i,e}$ for $e \in E$:

• $x_{i,e} \leftarrow 0$ for every $e \in E$.

• For every $(u, v) \in M_i$, let $P_{u,v}$ be a shortest path from $u$ to $v$ in $G$. For every $e \in P_{u,v}$, do $x_{i,e} \leftarrow x_{i,e} + 1$.

It is easy to check that the vector computed above satisfies $\sum_{e \in E} x_{i,e} = d(G, M_i) \le O(\sigma_{K_i}(G))$ (where the inequality follows from our choice of $M_i$). Now consider any cut $C$. For every pair $(u, v) \in M_i$ that is cut by $C$, the chosen path $P_{u,v}$ will cross $C$ at least once. This implies that the vector $(x_{i,e})$ satisfies all the relevant constraints, which gives the claimed upper bound of $\mathrm{LP}^{U}(G) \le O\left(\sum_{i=1}^{\ell} \sigma_{K_i}(G)\right)$.
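
The routing argument above can be checked on a small instance with the following Python sketch. It is an illustration under assumptions, not the paper's method: the helper names are hypothetical, the graph is unweighted, and the demand of a cut is taken to be the number of pairs of the matching that the cut separates, as the argument above suggests. The check enumerates every cut by brute force, so it is only meant for tiny graphs.

```python
from collections import deque
from itertools import combinations


def shortest_path_edges(adj, u, v):
    """Edges of one BFS shortest path from u to v in an unweighted graph."""
    parent = {u: None}
    queue = deque([u])
    while queue:
        a = queue.popleft()
        if a == v:
            break
        for b in adj[a]:
            if b not in parent:
                parent[b] = a
                queue.append(b)
    edges = []
    while parent[v] is not None:
        edges.append(frozenset((v, parent[v])))
        v = parent[v]
    return edges


def route_matching(adj, matching):
    """Add one unit to every edge on the chosen shortest path of each pair."""
    x = {frozenset((a, b)): 0 for a in adj for b in adj[a]}
    for u, v in matching:
        for e in shortest_path_edges(adj, u, v):
            x[e] += 1
    return x


def satisfies_all_cuts(adj, matching, x):
    """Brute force: every cut carries at least as much as the pairs it separates."""
    nodes = list(adj)
    for size in range(1, len(nodes)):
        for side in combinations(nodes, size):
            side = set(side)
            crossing = sum(w for e, w in x.items() if len(e & side) == 1)
            demand = sum(1 for u, v in matching if (u in side) != (v in side))
            if crossing < demand:
                return False
    return True


# Hypothetical example: the 4-cycle with the matching {(0, 2), (1, 3)}.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
pairs = [(0, 2), (1, 3)]
x_i = route_matching(cycle, pairs)
```

Here the total of `x_i` is the sum of the two shortest-path distances, which matches the identity $\sum_{e} x_{i,e} = d(G, M_i)$ used above.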

Finally, we argue the first inequality. We first note that $\mathrm{LP}^{L}(G)$ is exactly the same as $\mathrm{LP}_{\mathrm{mtch}}(G, K, M)$ (where $M$ is the (multi-set) union of $M_1, \dots, M_\ell$). Thus, we have

$$\mathrm{LP}^{L}(G) = \mathrm{LP}_{\mathrm{mtch}}(G, K, M) \ge \Omega\left(\frac{d(G, M)}{\log k}\right) = \Omega\left(\frac{\sum_{i=1}^{\ell} d(G, M_i)}{\log k}\right) \ge \Omega\left(\frac{\sum_{i=1}^{\ell} \sigma_{K_i}(G)}{\log k}\right),$$

where the first inequality follows from Lemma 8 (noting that its proof also works for the case when $M$ is a multi-set), the second equality follows from the fact that $M$ is the multi-set union of $M_1, \dots, M_\ell$, and the last inequality follows from our choices of $M_i$. This completes the proof. □

Footnote (on the $O(\log|V| \log\log|V|)$ subtree embedding used for Theorem 4): The cited result is not stated as a distribution over subtrees; rather, the paper presents a deterministic algorithm to compute a tree $T$ that has low weighted average stretch. In our application, this means that the algorithm can compute a tree $T$ such that, given weights $x_e$ for $e \in E(G)$, it is true that $\sum_{e = (u,v) \in E(G)} d_T(u, v) \cdot x_e \le \alpha \cdot \sum_{e \in E(G)} x_e$, which is enough for the rest of our proof to go through.


F.3 Relating Communication Complexity Lower Bounds to $\mathrm{LP}^{L}(G)$

We now make the straightforward connection between two-party lower bounds and $\mathrm{LP}^{L}(G)$. In what follows consider a problem $p = (f, G, K, \Sigma)$. Further, for any cut $C$ in graph $G$, we will denote by $f_C$ the two-party problem induced by the cut: i.e. Alice gets all the inputs from terminals in $K$ that are on one side of the cut and Bob gets the rest of the inputs. Finally, for a distribution $\mu$ over $\Sigma^K$, let $\mu_C$ be the induced distribution on the inputs on the two sides of the cut.

Lemma F.5. Let $p = (f, G, K, \Sigma)$ be a problem and $\mu$ be a distribution on $\Sigma^K$ such that the following holds for every cut $C$ in $G$:

$$D_{1/3, \mu_C}(f_C) \ge \sum_{i=1}^{\ell} b^i(C). \qquad (5)$$

Then the following lower bound holds:

$$R(p) \ge \mathrm{LP}^{L}(G).$$

Proof. Let $\Pi$ be an arbitrary protocol that correctly solves the problem $p = (f, G, K, \Sigma)$ with error at most $\epsilon = 1/3$. For a given input $Y \in \Sigma^K$, let $c_e(Y, \Pi)$ denote the total number of bits communicated over the edge $e \in E(G)$ for the input $Y$. For every $e \in E(G)$, define

$$x_e = \mathbb{E}_{Y \sim \mu}[c_e(Y, \Pi)].$$

Note that by linearity of expectation $\sum_{e \in E(G)} x_e$ denotes the expected cost of $\Pi$ under $\mu$. Further, if we can show that the vector $x = (x_e)_{e \in E(G)}$ as defined above is a feasible solution for $\mathrm{LP}^{L}(G)$, then the expected communication cost of $\Pi$ will be lower bounded by $\mathrm{LP}^{L}(G)$. The claim then follows since we chose $\Pi$ arbitrarily.

To complete the proof, we need to show that $x$ satisfies all the constraints. It follows from the definition that $x_e \ge 0$ for every $e \in E(G)$. Thus, to complete the proof we need to show that for every cut $C$

$$\sum_{e \in \delta(C)} x_e \ge \sum_{i=1}^{\ell} b^i(C). \qquad (6)$$

Towards this end fix an arbitrary cut $C$ and consider the following protocol $\Pi_C$ for the induced two-party function $f_C$. Alice runs $\Pi$ by herself as long as $\Pi$ only uses messages on edges on Alice's side of the cut $C$. If $\Pi$ needs to send a message over $\delta(C)$, then Alice sends the corresponding message to Bob. Bob then takes over and does the same. $\Pi_C$ terminates when $\Pi$ terminates. It is easy to check that $\Pi_C$ is a correct protocol for $f_C$ and errs with probability at most $\epsilon$. Further, the total communication of $\Pi_C$ for an input $Y \in \Sigma^K$ is exactly

$$\sum_{e \in \delta(C)} c_e(Y, \Pi).$$

Thus, by linearity of expectation, the expected cost of $\Pi_C$ under $\mu_C$ is $\sum_{e \in \delta(C)} x_e$. This along with (5) proves (6), as desired. □
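
The bookkeeping in the proof above can be illustrated with a short Python sketch. It assumes, purely for illustration, that the per-edge bit counts $c_e(Y, \Pi)$ of a protocol are already recorded as one dictionary per input and that the inputs are equally likely; the function names and the example numbers are hypothetical. The sketch shows that averaging the two-party cost across a cut gives the same value as summing the averaged edge loads $x_e$ over the cut edges.

```python
from fractions import Fraction


def expected_edge_load(transcripts):
    """x_e: average number of bits sent over edge e, with all inputs equally likely."""
    x = {}
    for bits_per_edge in transcripts:
        for e, bits in bits_per_edge.items():
            x[e] = x.get(e, 0) + Fraction(bits, len(transcripts))
    return x


def crosses(edge, alice_side):
    """True when exactly one endpoint of the edge is on Alice's side of the cut."""
    u, v = edge
    return (u in alice_side) != (v in alice_side)


def two_party_cost(bits_per_edge, alice_side):
    """Bits the simulated two-party protocol exchanges on one input."""
    return sum(b for e, b in bits_per_edge.items() if crosses(e, alice_side))


# Hypothetical record of a protocol on the path a-b-c for two inputs.
transcripts = [
    {("a", "b"): 3, ("b", "c"): 1},
    {("a", "b"): 1, ("b", "c"): 5},
]
alice = {"a"}
x = expected_edge_load(transcripts)
average_two_party = sum(
    Fraction(two_party_cost(t, alice), len(transcripts)) for t in transcripts
)
cut_load = sum(w for e, w in x.items() if crosses(e, alice))
```

Exact fractions are used so that the two quantities can be compared without rounding.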


The above immediately implies the following corollary:

Corollary 2. Let $p = (f, G, K, \Sigma)$ be a problem and $\mu$ be a distribution on $\Sigma^K$ such that the following holds for every cut $C$ in $G$:

$$D_{1/3, \mu_C}(f_C) \ge \alpha \cdot \left(\sum_{i=1}^{\ell} b^i(C)\right)$$

for some value $\alpha > 0$. Then the following lower bound holds:

$$R(p) \ge \alpha \cdot \mathrm{LP}^{L}(G).$$

Footnote (on the notation $D_{1/3, \mu_C}$ used above): For a two-party function $f$ and a distribution $\mu$ on the inputs of $f$, we will use $D_{\epsilon, \mu}(f)$ to denote the minimum, over all protocols that compute $f$ with probability at least $1 - \epsilon$ on every input, of the expected communication cost under $\mu$.

Next, we show that we can use Corollary 2 to reprove the following lower bound on the ED function:

Theorem 11 ([11]).


j?(ED,G,ji:,{o,i}")>n 


logfc ) ■ 


Proof. Let $\mu$ be the distribution that picks $k$ random vectors without replacement from $\{0,1\}^n$. It was shown in [11] that for every cut $C$ of $G$, we have $D_{1/3, \mu_C}(\mathrm{ED}_C) \ge \Omega(\min(|C|, |K \setminus C|))$. The claim then follows from Corollary 2 (for $\ell = 1$), Theorem 7 and noting that with the constraints above $\mathrm{LP}^{L}(G)$ is the same as $\mathrm{LP}_{\mathrm{MBN}}(G, K)$. □


G Proof of Theorem 5


G.1 A collection of multi-way cuts

We will consider multi-way cuts of a graph $G = (V, E)$. For our purposes a multi-way cut $C$ of $G$ is a partition of $V$ into at least two sets. For notational convenience, we will list all but one set of a multi-way cut $C$: i.e. the "missing" set will be implicitly defined by the set $V \setminus \bigcup_{S \in C} S$. Just for concreteness, we will call the sets explicitly mentioned in $C$ the explicit sets and the missing set the implicit set. (Note that this implies that the size $|C|$ of a multi-way cut is the number of explicit sets in $C$.) Also $\delta(C)$ denotes the set of cut-edges of $C$: i.e. the set of edges that have one end point in one explicit set of $C$ and the other end point in another set (explicit or implicit) of $C$.

Given two multi-way cuts $C$ and $C'$ of $G$, we say that $C$ is contained in $C'$ if every explicit set of $C'$ is the union of one or more explicit sets of $C$ (and maybe some extra elements from the implicit set of $C$).

We now define a family of collections of multi-way cuts that will be useful in proving our lower bounds.

Definition 1. We call a family $\mathcal{W} = \{\mathcal{W}_1, \dots, \mathcal{W}_\ell\}$ of collections of multi-way cuts an $(\ell, \alpha)$-multicut family for $G$ if the following is true for every $i \in [\ell]$. (For every $i \in [\ell]$, let $\mathcal{W}_i = \{C_i^1, \dots, C_i^{m_i}\}$, where each $C_i^j$ is a multi-way cut for $G$.)

(i) (Containment property) For every $1 \le j < m_i$, $C_i^j$ is contained in $C_i^{j+1}$.

(ii) (Disjointness property) For every $1 \le j_1 \ne j_2 \le m_i$, $\delta(C_i^{j_1})$ and $\delta(C_i^{j_2})$ are disjoint.

(iii) (Singleton property) Call an explicit set $S$ in $C_i^j$ for any $j \in [m_i]$ a singleton if $S$ contains exactly one set from $C_i^1$. Then $C_i^{m_i}$ has at least $\alpha \cdot |C_i^1|$ singleton explicit sets.


G.2 Multicut family to a lower bound

Next we show how an $(\ell, \alpha)$-multicut family implies a lower bound for certain functions. We begin with the specific class of functions.

Recall that $f : \Sigma^K \to \{0,1\}$ is $h$-maximally hard on the star graph if the following holds for any multicut $C$ of $K$. There exists a distribution $\mu_C^f$ such that the expected cost (under $\mu_C^f$) of any protocol that correctly computes $f$ on any star graph where each of the leaves has terminals from an explicit set from $C$ (and the center contains the implicit set of $C$) is $\Omega(|C| \cdot h(|\Sigma|))$.

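Lemma G.1. Let $\mathcal{W} = \{\mathcal{W}_1, \dots, \mathcal{W}_\ell\}$ be an $(\ell, \alpha)$-multicut family for $G$ such that every (explicit) set in $C_i^1$ has at least one terminal from $K$ in it, and let $f : \Sigma^K \to \{0,1\}$ be an $h$-maximally hard on the star graph function. Then

$$R(f, G, K, \Sigma) \ge \Omega\left(\frac{\alpha \cdot h(|\Sigma|)}{\ell \cdot \log k} \cdot \sum_{i=1}^{\ell} m_i \cdot |C_i^1|\right).$$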

Proof. Fix an $i \in [\ell]$. We will define a hard distribution $\mu_i$ for terminals in $K$ such that the expected cost of communication over all the crossing edges in the multi-way cuts in $\mathcal{W}_i = \{C_i^1, \dots, C_i^{m_i}\}$ for any correct protocol will be

$$\Omega\left(\frac{1}{\log k} \cdot \alpha \cdot h(|\Sigma|) \cdot m_i \cdot |C_i^1|\right). \qquad (7)$$

Note that picking the final hard distribution $\mu = \frac{1}{\ell} \sum_{i=1}^{\ell} \mu_i$ will complete the proof.

To complete the proof, we argue (7). Let $s$ be the number of singleton sets in $C_i^{m_i}$. Then by the containment property of $\mathcal{W}_i$ this implies that there exist explicit sets $T_1, \dots, T_s \in C_i^1$ such that for every $1 \le j \le m_i$, $C_i^j$ has $s$ singleton sets that contain $T_1, \dots, T_s$ respectively. We then let $\mu_i$ be $\mu^f_{\{T_1, \dots, T_s\}}$, where we think of $\{T_1, \dots, T_s\}$ as a multicut on $K$.

By the definition of $\mu^f_{\{T_1, \dots, T_s\}}$, we get that the expected amount of communication on the cut edges $\delta(C_i^j)$ (for any $j \in [m_i]$) is $\Omega(s \cdot h(|\Sigma|) / \log k)$. Since the cut edge sets are disjoint for any two cuts in $\mathcal{W}_i$, by linearity of expectation the expected cost over all edges in $\bigcup_{j} \delta(C_i^j)$ is $\Omega(s \cdot m_i \cdot h(|\Sigma|) / \log k)$. The proof is complete by noting that the Singleton property of $\mathcal{W}_i$ implies that $s \ge \alpha \cdot |C_i^1|$. □

G.3 Constructing the multicut family

The main result in this section is to show that we can construct a good multicut family.

Lemma G.2. For any given instance $(G, K)$ there exists an $(\ell = O(\log k), \alpha = 1/3)$-multicut family for $G$ such that $C_1^1 = \{\{u\} \mid u \in K\}$: i.e. all the explicit sets in $C_1^1$ just contain one terminal from $K$. Further, we have

$$\sum_{i=1}^{\ell} \sum_{j=1}^{m_i} |C_i^j| \ge \Omega(\mathrm{ST}(G, K)).$$

Note that Lemmas G.1 and G.2 prove Theorem 5.

In the rest of the section, we prove Lemma G.2. We will in fact first define a collection of multi-way cuts $C_1, \dots, C_t$ for some $t \ge 1$ such that they satisfy the containment and disjointness properties in Definition 1 for $\ell = 1$ (but not necessarily the singleton property). Further, these cuts satisfy the two extra properties needed in Lemma G.2. Finally, we will show how to divide the collection of multicuts into $O(\log k)$ sub-collections so that the new family is actually an $(O(\log k), 1/3)$-multicut family for $G$ (without losing the other desired properties).

We start with notation that will help us define our multi-way cut family. For any non-empty subset $S \subseteq V$, let $B_G(S, r)$ denote the set of all vertices in $G$ with a (shortest path) distance of at most $r$ from some node in $S$. More precisely:

$$B_G(S, r) = \{u \in V \mid \text{there exists a } w \in S \text{ such that } d_G(u, w) \le r\}.$$

We will define the multicuts $C_1, \dots, C_t$ by defining a partition of $K$ for each $i \in [t]$: let us call the $i$th partition $\mathcal{S}_i$. Given the partition $\mathcal{S}_i$, the definition of the multicut $C_i$ is simple: there is one explicit set in $C_i$ corresponding to each $S \in \mathcal{S}_i$. In particular, we have

$$C_i = \{B_G(S, i - 1) \mid S \in \mathcal{S}_i\},$$

where recall we only state the explicit sets in the multi-way cut $C_i$.

Thus, to complete the description of the multi-way cuts, it is enough to show how to compute $\mathcal{S}_i$. $\mathcal{S}_1$ is defined to be the partition of $K$ into the $k$ singleton sets $\{u\}$ (for every $u \in K$). To compute $\mathcal{S}_{i+1}$ from $\mathcal{S}_i$, we first construct a graph $G'_i$, which has one node for every $S \in \mathcal{S}_i$. Add an edge $(S, T)$ for $T \ne S \in \mathcal{S}_i$ in $G'_i$ if $B_G(S, i)$ intersects $B_G(T, i)$. For each connected component in $G'_i$, add the union of all sets from $\mathcal{S}_i$ in the connected component as one set in $\mathcal{S}_{i+1}$. Note that it is possible that $\mathcal{S}_{i+1} = \mathcal{S}_i$. The last index $t$ is defined as the smallest index such that $|\mathcal{S}_t| = 1$.
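
The construction of the partitions and of the cuts built from them can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: it assumes an unweighted connected graph given as an adjacency dictionary, and the names `bfs_dist`, `ball` and `build_multicuts` are hypothetical. Balls of radius $i - 1$ around the current sets are recorded as the explicit sets of the $i$th cut, and sets whose radius-$i$ balls intersect are merged with a small union-find.

```python
from collections import deque


def bfs_dist(adj, src):
    """Shortest-path distances from src in an unweighted graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def ball(dist, S, r):
    """All vertices within distance r of some terminal in S."""
    return frozenset(v for s in S for v, d in dist[s].items() if d <= r)


def build_multicuts(adj, terminals):
    """Return the explicit sets of the cuts for i = 1, ..., t."""
    dist = {t: bfs_dist(adj, t) for t in terminals}
    partition = [frozenset([t]) for t in terminals]      # first partition: singletons
    cuts = []
    i = 1
    while True:
        cuts.append([ball(dist, S, i - 1) for S in partition])
        if len(partition) == 1:
            return cuts
        if i > len(adj):                                  # guard for disconnected inputs
            raise ValueError("terminals are not connected")
        balls = [ball(dist, S, i) for S in partition]
        parent = list(range(len(partition)))

        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a

        for a in range(len(partition)):
            for b in range(a + 1, len(partition)):
                if balls[a] & balls[b]:                   # radius-i balls intersect
                    parent[find(a)] = find(b)
        groups = {}
        for a, S in enumerate(partition):
            groups.setdefault(find(a), set()).update(S)
        partition = [frozenset(g) for g in groups.values()]
        i += 1


# Hypothetical example: a path on 5 vertices with terminals at both ends.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
cuts = build_multicuts(path, [0, 4])
cut_sizes = [len(c) for c in cuts]
```

On this path the two terminals stay separate for two rounds and merge once their radius-2 balls meet at the middle vertex, so the recorded cuts have 2, 2 and 1 explicit sets.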
Note that the containment and disjointness properties of the multi-way cuts $C_1, \dots, C_t$ follow from the construction. Further, by definition, each explicit set in $C_1$ contains exactly one terminal from $K$. Next we argue that

Lemma G.3.

$$\sum_{i=1}^{t} |C_i| \ge \frac{1}{2} \cdot \mathrm{ST}(G, K).$$

Footnote (to the proof of Lemma G.1): The definition implies a lower bound on a star, but it is easy to see that any protocol on any connected graph can be simulated on a star graph with only an $O(\log k)$ blowup in the total communication. In particular, consider the following simulation. When a message needs to be sent from one of the $k$ nodes $u$ to another node $v$, the leaf corresponding to $u$ in the $k$-star uses $O(\log k)$ bits to identify the leaf corresponding to $v$ to the center so that the center can relay the original message from $u$ to $v$.

Proof. Let $\mathcal{G}$ denote the complete graph on the vertex set $K$, where the edge $(u, v)$ in $\mathcal{G}$ has a cost of $d_G(u, v)$. Let $T(\mathcal{G})$ denote an MST of $\mathcal{G}$. It is easy to see that the cost of $T(\mathcal{G})$ (denoted by $\mathrm{COST}(T(\mathcal{G}))$) is at least $\mathrm{ST}(G, K)$. Next we argue that $\sum_{i=1}^{t} |C_i|$ is at least half of the cost of $T(\mathcal{G})$, which would complete the proof.

Intuitively, the argument about the cost of $T(\mathcal{G})$ is essentially that our algorithm to compute the various $\mathcal{S}_i$ simulates a run of Borůvka's algorithm for computing an MST of $\mathcal{G}$.

We will now prove the result by induction on $k = |K|$. When $k = 2$, it is easy to see that $\sum_{i=1}^{t} |C_i| \ge \mathrm{COST}(T(\mathcal{G})) - 1 \ge \mathrm{COST}(T(\mathcal{G}))/2$, as desired. Let us assume that the claim is true for all $K$ with $|K| = k \ge 2$.

Next consider the case when $|K| = k + 1$. Let $i$ be the smallest index where the graph $G'_{i-1}$ has at most $k$ components (i.e. this is the first $i$ such that at least two singleton sets from $\mathcal{S}_{i-1}$ are merged when computing $\mathcal{S}_i$). Let $G'$ denote the graph where we collapse all nodes in $\mathcal{S}_i$ into "super-nodes" and let $K'$ denote the corresponding set of terminals in $G'$: i.e. $K'$ is in one to one correspondence with $\mathcal{S}_i$. Let $C'_1, \dots, C'_{t'}$ denote the cuts defined if our algorithm ran on $G'$ and $K'$. We claim two properties: (i) $\sum_{j=1}^{t} |C_j| - \sum_{j=1}^{t'} |C'_j| = (k+1) \cdot (i-1)$ and (ii) the corresponding complete graph $\mathcal{G}'$ has its MST cost (denoted by $\mathrm{COST}(T(\mathcal{G}'))$) to be at least $\mathrm{COST}(T(\mathcal{G})) - 2k(i-1)$. Note that claims (i) and (ii) complete the inductive step of the proof. To complete the proof, we argue these two claims.

We begin with claim (i). We first note that the multi-way cut $C'_j$ (for $j \in [t']$) is in one to one correspondence with $C_{i+j-1}$. In particular, we have $|C'_j| = |C_{i+j-1}|$. The claim then follows by noting that $\mathcal{S}_j = \mathcal{S}_1$ for all $j < i$ (and hence $\sum_{j=1}^{i-1} |C_j| = (k+1)(i-1)$).

We finish by arguing claim (ii). The main observation is that $T(\mathcal{G})$ can be obtained by starting with $T(\mathcal{G}')$ and then replacing each super node in $T(\mathcal{G}')$ by a spanning tree of the corresponding component of $G'_{i-1}$ (recall that each super node in $K'$ is constructed by collapsing a component in $G'_{i-1}$ of size at least two). To complete the claim, we need to track the changes in edge weights. We first note that the cost of each edge in $\mathcal{G}'$ is smaller than the corresponding edge in $\mathcal{G}$ by exactly $2(i-1)$. Second, each edge added back for each super node in $K'$ has cost at most $2(i-1)$. This implies that

$$\mathrm{COST}(T(\mathcal{G})) - \mathrm{COST}(T(\mathcal{G}')) \le (|K'| - 1) \cdot 2(i-1) + (|K| - |K'|) \cdot 2(i-1) \le 2k(i-1),$$

as desired. □

Note that now we have shown a $(1, 1/k)$-multicut family that satisfies all the other conditions in Lemma G.2. We now present a simple way to convert this into an $(O(\log k), 1/3)$-multicut family. In particular, we will group $\ell = O(\log k)$ consecutive chunks of multi-way cuts from $C_1, \dots, C_t$ to obtain our final family $\mathcal{W}_1, \dots, \mathcal{W}_\ell$. We first show how we compute $\mathcal{W}_1$. Let $j$ be the largest index in $[t]$ such that $C_j$ has at least $k/3$ singleton sets. Then $\mathcal{W}_1 = \{C_1, \dots, C_j\}$. Now note that $|\mathcal{S}_{j+1}| \le 2k/3$ (because it has at most $k/3$ singleton sets and the rest in the worst case might form subsets of size 2). We now re-start the process from $C_{j+1}$, where we think of $\mathcal{S}_{j+1}$ as the set of terminals. If this process stops in $\ell$ steps, note that this results in an $(\ell, 1/3)$-multicut family. Recall that once we go from constructing $\mathcal{W}_i$ to constructing $\mathcal{W}_{i+1}$, the number of terminals decreases by a factor of at least $3/2$. This in turn implies that $\ell = O(\log k)$, as desired.
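
The grouping step above can be sketched in Python as follows. This is a minimal sketch under assumptions, not the paper's code: each cut is represented only by its partition of the terminals, the names `singleton_count` and `group_into_chunks` are hypothetical, and a set counts as a singleton when it contains exactly one set of the first partition of the current chunk. A chunk is extended as long as the next partition still has at least a third of the chunk's starting sets as singletons.

```python
def singleton_count(base, partition):
    """Number of sets in `partition` that contain exactly one set of `base`."""
    return sum(
        1 for S in partition
        if sum(1 for B in base if B <= S) == 1
    )


def group_into_chunks(partitions):
    """Split the indices of consecutive partitions into chunks with alpha = 1/3."""
    chunks = []
    start = 0
    while start < len(partitions):
        base = partitions[start]          # sets treated as the chunk's terminals
        end = start
        while (end + 1 < len(partitions)
               and 3 * singleton_count(base, partitions[end + 1]) >= len(base)):
            end += 1
        chunks.append(list(range(start, end + 1)))
        start = end + 1                   # restart from the next partition
    return chunks


# Hypothetical sequence of partitions of the terminals 0..5.
parts = [
    [frozenset({t}) for t in range(6)],
    [frozenset({0, 1})] + [frozenset({t}) for t in range(2, 6)],
    [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})],
    [frozenset({0, 1, 2, 3}), frozenset({4, 5})],
    [frozenset(range(6))],
]
chunks = group_into_chunks(parts)
```

Because sets only merge as the index grows, the singleton count never increases within a chunk, so extending greedily finds the largest valid index for each chunk.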


Footnote (on the base case $k = 2$ in the proof of Lemma G.3): The inequality holds as long as $\mathrm{COST}(T(\mathcal{G})) \ge 2$, where $\mathcal{G}$ is the complete graph on $K$ with edge costs $d_G(u, v)$. If $\mathrm{COST}(T(\mathcal{G})) = 1$, then note that $G$ is just a unit cost edge with the two endpoints being the two terminals. Note that in this case $\sum_{i=1}^{t} |C_i| = 1$ and hence the inequality $\sum_{i=1}^{t} |C_i| \ge \mathrm{COST}(T(\mathcal{G}))/2$ still holds as required.

Footnote (on the collapsed graph in the proof of Lemma G.3): In particular, $\mathcal{G}'$ is a complete graph on $\mathcal{S}_i$ and the cost of an edge $(u', v')$ is $d_G(u', v') - 2(i-1)$, where $d_G(u', v')$ is the distance between the closest pair of terminals in $u'$ and $v'$ (recall that $u'$ and $v'$ correspond to disjoint subsets of $K$).

Footnote (on why claims (i) and (ii) complete the inductive step): It can be verified that our construction of $C_i, \dots, C_t$ on $G$ corresponds to running our algorithm on $G'$ with the terminal set $K'$. This implies (e.g. by induction) that $\sum_{j=1}^{t'} |C'_j| \ge \mathrm{COST}(T(\mathcal{G}'))/2$. Hence by (i) we have $\sum_{j=1}^{t} |C_j| \ge (k+1)(i-1) + \mathrm{COST}(T(\mathcal{G}'))/2 \ge (k+1)(i-1) + \mathrm{COST}(T(\mathcal{G}))/2 - k(i-1) \ge \mathrm{COST}(T(\mathcal{G}))/2 + (i-1) \ge \mathrm{COST}(T(\mathcal{G}))/2$, where the second inequality follows from (ii).

Footnote (on the one to one correspondence in claim (i)): This follows by our earlier observation that we can think of the construction of $C_i, \dots, C_t$ as running our algorithm on $G'$ with the terminal set being $K'$.